Preface



This volume, Prospects for hardware foundations, explores the theoretical 
foundations of hardware design. It contains twelve papers on 

(i) mathematical foundations of hardware modelling; 

(ii) models of hardware and dynamical systems; and
(iii) verification and deductive design of systems.

The papers investigate some of the problems at the heart of our theoretical 
understanding of hardware systems, their design and their integration 
with other physical or biological systems. The volume aims to make a 
conceptual contribution to the theory of hardware and to offer prospects 
for its development. 

Specifically, the articles address theoretical topics, including: stream 
processing, spatially extended systems, hierarchical structures, integra- 
tion of analogue and digital models. There are case studies of super- 
scalar processors, the Java Virtual Machine, and biological excitable me- 
dia. There are design and verification techniques including higher order 
verification, process algebra, state charts, simulation and reasoning of 
analogue models. Also there are reflections on constructs for future gen- 
eration hardware description languages. 

This volume is also a scientific memoir of the NADA Working Group,
the ESPRIT Basic Research Action 8533. The Action existed over the 
period April 1994 - October 1997. The Action brought together nine 
research groups, with interests in theoretical computer science, math- 
ematical logic, formal methods for system design, dynamical systems, 
and hardware, to pursue a multidisciplinary research programme in the 
foundations of hardware. It held five general meetings and four specialist 
workshops, at which the groups met together, with some invited guests, 
for intensive exchanges; it also sponsored several visits between sites. The 
introduction to this volume gives further information about NADA; here 
we describe its origins and scientific purpose. 

NADA 

At the NATO Summer School at Marktoberdorf on Logic and algebra of
specification, July 23 - August 4, 1991¹, the second of the
foundations-oriented “Blue Series” of this distinguished institution, there
was enthusiastic discussion of hardware systems by people who were studying
theoretical aspects of hardware, or were drawn to hardware systems in their
work on design and verification. There was excitement about the wealth
of problems and the possibilities of solving them. Discussions between
John Tucker, Helmut Schwichtenberg, Hans Leiss, Bernhard Möller, Walter
Dosch, Carlos Delgado Kloos and Manfred Broy created a common vision
of a wide-ranging collaborative study of hardware systems, integrating
knowledge of theoretical computer science, mathematical logic, formal
methods, and hardware systems. After the Summer School, Jan Bergstra,
Viggo Stoltenberg-Hansen and Arun Holden completed the team. Our
group wanted to collaborate on research that might

(i) reveal the essential scientific structure of hardware systems;

(ii) shape a future generation of hardware description languages;

(iii) produce new mathematical methods for design and verification;

(iv) yield interesting theoretical and mathematical problems; and

(v) perform advanced case studies.

¹ F L Bauer, W Brauer and H Schwichtenberg (eds.), Logic and algebra of specification,
NATO ASI Series F, Vol. 94, Springer-Verlag 1993

A first proposal for a Basic Research Action in the ESPRIT Programme,
in October 1991, was rewarded with polite comments from referees and no
funds. Undeterred and keen to collaborate, and with Keith Hanna joining
the team, we submitted a revised application, which succeeded: the NADA
Basic Research Action Working Group 8533 was awarded in 1993, with one of
us (BM) as Coordinator, and held its inaugural meeting in April 1994 at
TU Munich.

The aim of the Action was to collaborate in research on new, math- 
ematically sound methods for the description and design of hardware 
systems. We interpreted the term “hardware systems” very generally to 
include circuits, architectures and the hardware/software interface. More 
controversially, we also included the interface between hardware and phys- 
ical and biological systems. 

One goal was the search for a next generation hardware description 
language having a high level of abstraction and a clean, formally defined 
semantics. NADA was to analyse the requirements for such an idealised 
language which was called NIL. Description aspects included general ques- 
tions of timing, parameterisation and modularisation. The design tech- 
niques included verification, deductive design in the small, and structured 
design in the large. 

The goal of the research on modelling hardware and dynamical sys- 
tems was to elicit requirements on design methodologies and description 
languages. Architectures, circuits, and emerging new paradigms for hard- 
ware systems were studied, as well as various standard technologies, in the 
search for unified mathematical models of hardware. Representative case
studies were also needed for demonstrations of the developed techniques. 

The goal of the research on algebraic and logical foundations for hard- 
ware design was to support the above tasks. Appropriate mathematical 
methods were taken from computability theory, algebraic specifications, 
higher order algebra, proof theory and process algebra. 

In publishing this volume we wish to bring together some of our re- 
sults and make available our agenda and approach. It is interesting to 
reflect on progress in the theory of hardware. This volume may be com- 
pared with, for example, a volume^ edited by one of us (JVT), almost ten 
years ago, on the then current state of hardware foundations. There has 
been clear progress on most fronts: mathematical tools, semantic frame- 
works, verification and specification techniques, deductive methods, and 
complexity of case studies have all been advanced. However, we are far 
from completing the important scientific task of creating a comprehensive 
theory of hardware. 
Abstract. This introductory chapter provides a brief survey of the history
and accomplishments of NADA. An extended section presents the
recommendations of the NADA Group on future hardware description 
languages. Finally, the contents of the remaining chapters are briefly in- 
troduced. The chapter also lists some references to NADA-relevant work 
of those NADA members that have not contributed to one of the further 
chapters. 



1 A Survey of Esprit Working Group 8533 NADA - New 
Hardware Design Methods 

1.1 Aims and Scope 

The topic of NADA was research on new, mathematically sound methods for 
the description and design of hardware systems. The term “hardware systems” 
was interpreted very generally to include architectures and circuits, the hard- 
ware/software interface and even biological systems. Three major subtopics were 
identified at the Inaugural Meeting in April 1994:

— modeling; 

— hardware description languages and design techniques; 

— foundations. 

The goal of the research on modeling was to elicit requirements on design 
methodologies and description languages. It included studying architectures, cir- 
cuits and emerging new paradigms for hardware systems, as well as various stan- 
dard technologies. 

The investigations around NIL (“NADA Integrated Language”), a fictitious 
next generation hardware description language, served to exhibit hardware spec- 
ification and description concepts at a high level of abstraction for which a clean 
formal semantics can be given. The requirements on such a language were dis- 
tilled out of the extensive body of case studies the NADA Group has performed. 

The investigated design techniques included verification and deductive de- 
sign. 

The Group also performed research on the mathematical foundations
for hardware design. Appropriate mathematical methods were taken from com- 
putation theory, higher order algebra, proof theory, relation algebra and timed 
process algebra. 



B. Möller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 1-26, 1998.
Springer-Verlag Berlin Heidelberg 1998 




2 The NADA Group 

1.2 The NADA Participants 

We give a table of the NADA participants as of the official project termination 
in October 1997. The University of Augsburg served as the Prime Contractor 
with B. Möller as the Coordinator of the Group. The responsible researcher at
each site is marked by italic type font. 



University of Amsterdam                 Jan A. Bergstra, Alban Ponse
University of Augsburg                  Bernhard Möller
University of Kent                      F. Keith Hanna
University of Leeds                     Arun V. Holden
Medical University of Lübeck            Walter Dosch
University Carlos III of Madrid         Carlos Delgado Kloos, Peter T. Breuer,
                                        Natividad Martínez Madrid,
                                        Luis Sánchez Fernández
Ludwig-Maximilians-University Munich    Helmut Schwichtenberg, Hans Leiss,
                                        Ulrich Berger
Technical University Munich             Manfred Broy, Peter Scholz, Jan Philipps
Royal Technical Highschool Stockholm    Karl Meinke
University of Wales Swansea             John V. Tucker, Neal A. Harman,
                                        Matthew J. Poole, Karen Stephenson
Uppsala University                      Viggo Stoltenberg-Hansen



Most of these participants have been with the Group from the very beginning.
Some of them moved during the course of NADA: the Spanish partners
switched from the ETSIT Madrid to their current university, W. Dosch left
Augsburg to move to Lübeck and K. Meinke changed from Swansea to
Stockholm. Earlier participants were J. Brunekreef at Amsterdam, M. Fuchs
at TU Munich, B.C. Thompson at Swansea and P. Abdullah at Uppsala.

1.3 Main Overall Achievements 

The first year was dominated by case studies in description and modeling as 
well as in verification and deductive design, building on and extending the
partners’ previous work, mainly concentrating on aspects in the small. Concerning
NIL, the Group came up with a list of desirable language concepts and explored
the possibilities of representing them within the generic theorem prover Isabelle; 
moreover, the selection of the language concepts was based on their suitability 
for the methodology of deductive design as employed in various case studies. 
On the side of foundations, a central theme was the use of streams based on 
various formal definitions of that concept. Moreover, there were investigations 
on modeling Synchronous Concurrent Algorithms (SCAs) using process algebra, 
effective algebras, models of time, tool-independent representation of algorithms 
and hardware, a framework for hardware-software codesign and technology in- 
dependent specifications. 
The second year was dominated by studies in modeling and on foundational
issues, and by further case studies. Concerning language issues, the Group started
exploring MHDL for possibilities of serving as a basis for NIL. It turned out that 
many of the concepts exhibited in Year I were already accommodated there, 
although a coherent semantic framework was missing. 

The third year significantly advanced the state of the art in the areas of
modeling, deductive design and foundations. Concerning language issues, the
Group suffered a severe setback: the development of MHDL, the language
it had started exploring as a possible basis for NIL, was abandoned. Meanwhile
a new development is under way: SLDL (System Level Description Language).
However, this language is still in its requirements definition phase and so could
not be used within the project duration. The Group’s recommendations on NIL
are detailed in Section 3.



1.4 Workshops and Conferences 

Besides the NADA Inaugural Meeting in April ’94, hosted by TU Munich, the
formal milestones for NADA were the three Annual Meetings, organized by Swansea
in March 1995, Uppsala in April 1996 and Madrid in April 1997, as well as the
Final Meeting, organized by Lübeck in September 1997. In between these there
were specialist workshops, viz. one on first case studies and an exploration of
language concepts, organized by Madrid in October 1994, one on the central topic
of streams, organized by LMU Munich in October 1995, and one on further case
studies and language issues, organized by Amsterdam in October 1996.

Besides these NADA-specific meetings, NADA partners were also concerned 
with the organization of related conferences. 

A lot of important material in the realm of deductive design was presented at
the Third International Conference on The Mathematics of Program Construction,
Kloster Irsee, Germany, July 17-21, 1995 [19,20], which was chaired and
organized by NADA member B. Möller and partially sponsored by NADA.

On the side of foundations, two major events were the Second and Third
International Workshops on Higher Order Algebra, Logic and Term Rewriting.
The former of these took place at Paderborn, Sept. 21-22, 1995 [6], sponsored
by the EACSL and with NADA members K. Meinke and B. Möller in the
organizing committee. The latter of the workshops was held in conjunction with
ALP (Algebra and Logic Programming) Southampton, Sept. 2-5, 1997 [8]. It
was sponsored by the EACSL, and NADA members K. Meinke and B. Möller
were again part of the organizing committee.

Finally, the NADA group had the opportunity to present its work to a broad 
audience at the Third Annual NADA Meeting held in conjunction with the con- 
ferences CHDL/VUFE/LCMQA ’97 at Toledo, 20-25 April, 1997, and organized 
by NADA member C. Delgado Kloos. In particular, a tutorial on four central 
threads of NADA research was presented. All this was well-received and gave 
NADA considerable international visibility. 
2 Project Achievements of the Individual Sites

2.1 Amsterdam 

Amsterdam’s participation in NADA triggered a lot of design and specification 
activity, ranging from refinement of the concept early input to design and appli- 
cation of Grid Protocols. Early input is a way of passing values between concur- 
rent processes; it is primitive to formalize and used to model stream processing. 
Amsterdam’s participation in NADA was driven by two general goals: 

— To find simple and adequate primitives, and elegant algebraic methods to 
model data transfer between concurrent processes, and 

— To give a process algebraic account of applications that are relevant to 
NADA, viz. the modeling of hardware phenomena, and development of re- 
spective analysis techniques. 

As an example we mention the Amsterdam work on SCAs (Synchronous Con- 
current Algorithms) as developed by Tucker et al. The following achievements 
of Amsterdam’s participation in NADA can be distinguished: 

— A better understanding of scientific work done elsewhere. This applies in 
particular to the modeling of hardware and other concurrent phenomena in 
which parallel input/output is a major ingredient.

— Much more focus on a systematic treatment of data manipulation in process 
algebra, as exemplified by the early input concept. Traditionally, process al- 
gebraic modeling often concentrated on operational aspects of concurrency, 
and treatment of data manipulation (by processes) was ad hoc or left im- 
plicit. 

— Distinguishing primitives for file-transfer; further research on an algebraic 
treatment of (concurrent) assignment is being undertaken. 

— A close observation of the development of μCRL (“process algebra with
data”) at CWI.

— Research on process algebra with recursive operations, such as binary Kleene 
star and nesting, in particular in combination with early input and value- 
passing. 



2.2 Augsburg 

The focus of the work at Augsburg has been on the method of Deductive Design, 
i.e., on the systematic construction of a system implementation, starting from 
its behavioural specification, according to formal, provably correct rules. The 
main advantages of this approach are: 

— The resulting implementation is correct by construction. 

— The rules can be formulated schematically, independent of the particular ap- 
plication area, and hence can be re-used for wide classes of similar problems. 

— Since the rules are formal, the design process can be assisted by machine. 
— Implementations can be constructed in a modular way, with emphasis on
correctness first and subsequent transformation to increase performance. 

— The formal derivation also serves as a record of the design decisions taken 
and hence is an explanatory documentation. Upon modification of the system 
specification it eases revision of the implementation. 

This paradigm has successfully been applied to sequential and, to a lesser 
extent, also to parallel programs. Since hardware consists of “frozen” programs, 
it is an obvious idea to apply this method to hardware design. A large amount 
of work in this area has been done by M. Sheeran and others. 

The major novel ingredients and achievements in the Augsburg work are the 
following: 

— specification at the level of predicate logic, not necessarily algorithmic yet, 

— a clearer disentangling of the abstract idea of an algorithm from the concrete 
layout that realizes it, 

— in particular, introduction of wiring operators in a late stage of the deriva- 
tions, thus avoiding a lot of burden and clutter, 

— a simpler approach to retiming that avoids the concept of anti-delays, 

— in the asynchronous case a strongly algebraic approach to streams, notably 
to questions about fairness. 

While the first two items had already been established in the CIP group to
which NADA members B. Möller and W. Dosch belonged, the latter three items
were elaborated within the NADA project.

The case studies include many of the IFIP WG10.5 Benchmark Verification
Problems [7]. In addition to the treatment of basic combinational and sequential
circuits, a very simple treatment of systolic circuits was achieved. Finally,
concerning higher-level hardware concepts, an easy formal account of pipelining
became possible.

Special emphasis was laid on parameterization and re-usability aspects. 

In the synchronous case, a major breakthrough was the switch to formalizing 
the specifications and derivations using the functional programming language 
Gofer, a subset of Haskell. The polymorphism of this language allows the use 
of Ştefănescu’s network algebra and other algebraic laws both at the level of
combinational and sequential circuits. Also fixpoint induction and related proof 
principles can be applied directly. Moreover, many derivations can be performed 
in a polymorphic way abstracting from concrete applications and hence achieving 
much better re-usability. 
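
The stream view of synchronous circuits described above can be illustrated with a small sketch. It is written in Python rather than Gofer and is not taken from the Augsburg derivations; the names lift, delay and feedback are hypothetical helpers introduced only for this illustration. A combinational circuit is modelled as a function applied pointwise to input streams, a register as a unit delay, and a sequential circuit as a combinational step whose state is fed back through a register.

```python
def lift(f):
    """Lift a combinational function to a pointwise stream transformer."""
    def transformer(*streams):
        return map(f, *streams)
    return transformer


def delay(init, stream):
    """Unit delay (register): emit init first, then the input one tick late."""
    yield init
    yield from stream


def feedback(step, init, inputs):
    """Sequential circuit: step maps (state, input) to (next_state, output)."""
    state = init
    for x in inputs:
        state, out = step(state, x)
        yield out


def accumulate_step(state, x):
    """Combinational part of an accumulator: add the input to the state."""
    new_state = state + x
    return new_state, new_state


# Edge detector: XOR of a signal with its own delayed copy.
xor = lift(lambda a, b: a ^ b)
signal = [0, 1, 1, 0, 1]
edges = list(xor(signal, delay(0, signal)))

# Running sum: the accumulator step closed over a register.
sums = list(feedback(accumulate_step, 0, [1, 2, 3, 4]))
```

Because lift is polymorphic in the element type, the same definitions work for bits, integers or tuples of signals, which is the kind of re-use the paragraph above attributes to polymorphic derivations.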

In the asynchronous case, the algebraic basis has been consolidated. This 
thread of work concerns more algebraic ways of specifying, reasoning about and 
transforming descriptions of (sets of) streams. Particular emphasis is laid on the 
use of regular algebra in describing stream patterns, notably in connection with 
questions of fairness. This algebraic approach has also been tied in with other 
specification formalisms such as temporal and modal logic. 
2.3 Kent

The work undertaken at Kent has focused on two themes: the use of dependent
types in specifications for digital systems and the extension of formal verification
from the digital domain towards the analog domain.

The motivation for using dependent types (loosely, types that are parameter- 
ized by values and/or other types) for formal specification is that they offer both 
greater expressiveness and greater generality. During the course of the NADA 
project, Kent have: 

— Carried out an in-depth investigation of the use of dependent types in spec- 
ifying and in formally verifying complex systolic array architectures; 

— Shown how it is possible to capture, within a formal theory, the interrelation 
between the structural and the behavioural aspects of a circuit. Circuits are 
treated as typed graphs and their behavioural specifications as predicates. 
The key difficulty that has to be overcome is that, in a structural description, 
the types of ports have to be treated as values whereas, in the behavioural 
description, they appear as types. Using dependent types, it was found pos- 
sible to devise a sufficiently expressive type for interfaces that guaranteed 
the consistency of structural and behavioural specifications. The approach 
was demonstrated using Veritas, an implementation of a dependently-typed
higher-order logic. 

The motivation for extending formal specification and verification techniques 
from the digital domain towards the analog domain is that, in practice, significant 
sections of a digital system often have to be designed at the analog electronics 
level of abstraction. By doing this, logically redundant signal conditioning and 
buffering stages can be eliminated, leading to circuitry which can be up to an 
order of magnitude faster, and with lower power consumption, than the
corresponding circuitry designed (in terms of gates and flip-flops, etc.) at the digital
level of abstraction. During the course of the NADA project, Kent have: 

— Demonstrated how, using higher-order logic, the behavioural characteristics 
of typical electronic components (transistors, diodes, resistors) can be spec- 
ified in terms of analog voltages and currents. 

— Shown how the behavioural specification of an electronic circuit (for ex- 
ample, one that uses pass-transistor logic) can be inferred from the analog 
specifications of the component parts and their interconnections and how 
specifications at the analog and digital levels are related. 

— Devised and implemented a decision procedure that, given a description 
of an analog circuit and behavioural specifications of its component parts, 
determines whether it correctly implements a given behavioural specification. 

Kent believe that this work is likely to impact strongly upon the design of future 
generations of mixed-level hardware design languages which, at present, lack 
rigorous foundations. 
2.4 Leeds

The focus of the work at Leeds has been on the development and evaluation of
case studies based on biomedical applications. These are practical illustrations
of the problems that currently exist in the computational modeling of complicated,
highly structured, spatially extended systems that interact with data streams and
are represented mathematically by different classes of models: cellular automata,
coupled map lattices, coupled ordinary differential equation lattices, and partial
differential equations. The main idea of the approach is to use the
theory of synchronous concurrent algorithms as a framework within which
different types of model are embedded, and can be coupled. The advantage of this
modular approach is that newer generation models can readily be incorporated,
and the framework is applicable to modeling a wide range of hierarchical,
structured systems, not just the case studies, which are based on cardiac muscle,
arrhythmias, and their control. The principal results of the Leeds work have been:

— The development of a family of partial differential equation models for
different normal and pathological cardiac tissues, and the use of these models
to simulate cardiac arrhythmias.

— The design of a novel means of controlling re-entrant cardiac arrhythmias,
and the evaluation of its practicability.

— The development of a geometric, anisotropic model for the ventricles of the
heart, and the coupling of this geometry with simple models of excitability
(coupled map lattice and simple ordinary differential equation models) and
with partial differential equation models.

— The development of a formal approach, within the theory of synchronous 
concurrent algorithms, for coupling and verifying the above models. 

— The application of the same approach to other biological systems - neural 
nets, and oceanic plankton population dynamics. 
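
To make the idea of a spatially extended model running as a synchronous concurrent algorithm more concrete, here is a minimal sketch. It is not one of the Leeds cardiac models: it assumes a three-state excitable cellular automaton in the style of Greenberg and Hastings on a ring of cells, chosen only because it is small. Every cell is a module that reads its neighbours' previous values, and all cells are updated in lockstep on each tick of a global clock. The names cell_update, sca_step and run are hypothetical.

```python
REST, EXCITED, REFRACTORY = 0, 1, 2


def cell_update(left, me, right):
    """Local rule of one module: excited -> refractory -> rest,
    and a resting cell fires if a neighbour was excited."""
    if me == EXCITED:
        return REFRACTORY
    if me == REFRACTORY:
        return REST
    return EXCITED if EXCITED in (left, right) else REST


def sca_step(state):
    """One global clock tick: every cell reads the old state only."""
    n = len(state)
    return [
        cell_update(state[(i - 1) % n], state[i], state[(i + 1) % n])
        for i in range(n)
    ]


def run(state, ticks):
    """Return the list of network states for ticks 0..ticks."""
    history = [list(state)]
    for _ in range(ticks):
        state = sca_step(state)
        history.append(state)
    return history


# A single excitation with a refractory tail travels around the ring.
wave = run([EXCITED, REFRACTORY, REST, REST, REST, REST], 2)
```

Replacing cell_update by a coupled map or by one step of a discretised differential equation would leave sca_step and run unchanged, which is the modularity argument made in the text.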

This work has been carried out primarily in collaboration with the Swansea group,
and has led to extensive funding from UK research councils and medical charities
for the application of these models to clinical problems. It has also led to UK
funding to continue the collaboration with the Swansea group.

2.5 Lübeck

The work in Lübeck concentrated on deductive design in the areas of both
hardware and software [5]. This methodology aims at systematically deriving a
system implementation from its behavioural specification following sound
transformation rules. The derivation documents the design decisions and guides the
redesign upon changing requirements. Within a structured design methodology,
one initially abstracts from layout and timing issues and concentrates on
algorithmic design principles. Synchronous descriptions are based on the algebra of
finite sequences modeling the linear succession in space or in time; asynchronous
descriptions employ stream processing functions. The major contributions in the
work of Lübeck consist in
— isolating and formalizing important design steps [3],

— abstracting characteristic design patterns from specific case studies, 

— deriving standard implementations for iterative and tree structured networks 
using higher-order functions, 

— reasoning about ordered streams by set abstraction [4]. 

For a class of digit recurrence algorithms, W. Dosch has derived different 
implementations as combinational and sequential circuits in a schematic way. 
This class of iterative circuits was then generalized to tree structured networks 
originating from cascade recursive functions. The derivations are parameterized 
supporting their re-use in different applications. 
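
The contrast between iterative and tree structured networks can be sketched with two higher-order functions. This is an illustration under stated assumptions, not the derivation carried out at Lübeck: it assumes an associative binary operation op, and the names iterative_network, tree_network and tree_depth are hypothetical. Both networks compute the same value; the tree arrangement only changes the shape, and hence the depth, of the resulting circuit.

```python
from functools import reduce


def iterative_network(op, xs):
    """Linear cascade: ((x0 op x1) op x2) op ... with depth n - 1."""
    return reduce(op, xs)


def tree_network(op, xs):
    """Balanced tree of op cells; adjacent pairs are combined level by level."""
    level = list(xs)
    if not level:
        raise ValueError("tree_network needs at least one input")
    while len(level) > 1:
        paired = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # odd element passes through unchanged
        level = paired
    return level[0]


def tree_depth(n):
    """Number of levels of op cells in the tree network for n inputs."""
    depth = 0
    while n > 1:
        n = (n + 1) // 2
        depth += 1
    return depth


add = lambda a, b: a + b
total_linear = iterative_network(add, range(1, 9))
total_tree = tree_network(add, range(1, 9))
```

Since only adjacent elements are paired, the order of operands is preserved, so the two networks agree for any associative op, including non-commutative ones such as string concatenation.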

2.6 Madrid 

The Madrid work has concentrated on three main areas. The first area is
the semantics of hardware description languages, chiefly in connection with the 
formalization of VHDL and the development of specification, refinement and 
verification calculi for that language and real-time systems in general. Secondly, 
Madrid have been working with the specification language LOTOS. They have 
produced methods for designing hardware systems and generating VHDL code 
automatically from the design. Thirdly, Madrid have been researching in the 
area of codesign also using LOTOS as the system description language. 

Madrid have given several presentations to the Group about the last two 
areas and have been actively developing the first area within the project itself. 
Participation in NADA has been especially useful to Madrid in the further
development of their logic for real-time system design from their approach to VHDL
semantics. With regard to the latter, Madrid have debugged the refinement cal- 
culus of VHDL, finally proving it to be complete with respect to their formal 
semantics for the language. Madrid have also taken the opportunity to investi- 
gate the problems of design in non-discrete regimes using their formalisms and 
those of others. The operational amplifier has been a particularly instructive 
example of an analog device operating in continuous time. From their attempts 
to describe it, Madrid have concluded that compositionality under these regimes 
derives from the constraints on a system and not from the solutions to (subsets 
of) those constraints. If one tries the latter approach one runs into problems 
with causality requirements. As a result of these investigations, Madrid have 
been able to suggest a semantic model for the analog extension to VHDL (now 
accepted by the IEEE), developing a prototype implementation. 

2.7 LMU Munich 

LMU have worked on a variety of subjects all subsumable under the heading 
‘applied proof theory’. This was done in conjunction with the development of an 
interactive prover MINLOG, well suited to the purpose and, in particular, for 
hardware verification and development. Most notably U. Berger and M. Eberl 
have been working on NADA-related questions. 
LMU work in the area of program extraction from classical proofs. To this
end they apply H. Friedman’s A-translation followed by a modified realizability
interpretation. However, to obtain a reasonable program it is essential to use a
refinement of the A-translation introduced in Berger/Schwichtenberg 1995. This
refinement makes it possible that not all atoms in the proof are A-translated,
but only those with a “critical” relation symbol. Yannis Moschovakis suggested
the following example of a classical existence proof with a quantifier-free kernel
which does not obviously contain an algorithm: the gcd of two natural numbers
a_1 and a_2 is a linear combination of the two. In that example only the divisibility
relation ·|· turns out to be critical.
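
For orientation, the computational content one would expect from such a proof is a procedure that returns the gcd together with the coefficients of the linear combination. The sketch below is the classical extended Euclidean algorithm with integer coefficients; it is not the program extracted by the refined A-translation, whose form is not given here, and the function name extended_gcd is chosen for this illustration only.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that g == gcd(a, b) and g == x * a + y * b.

    Invariant of the loop: old_r == old_x * a + old_y * b
                       and     r ==     x * a +     y * b.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(66, 27)
```

For the inputs 66 and 27 this gives g = 3 with 3 = (-2) * 66 + 5 * 27, so the gcd is exhibited as a linear combination and, in particular, divides both arguments.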

In addition, LMU worked on case studies in hardware description and verifi- 
cation, among which is the min/max-example from the IFIP WG 10.5 Hardware 
Verification Benchmarks [7]. 

2.8 TU Munich 

In the past years, the design methodology Focus for distributed systems has been 
developed at TUM. As the original intention of Focus was to describe software 
systems, the NADA project has been a unique opportunity to verify to which 
extent Focus is applicable for the specification of hardware, too. To achieve this, 
a number of case studies, like the formal development of a production cell, a 
distributed min/max component, interfaces, modulo-n counter, and the Alpha 
AXP™ shared memory have been carried out using Focus.

Apart from these case studies, which have shown that Focus is indeed suit- 
able for the description of hardware oriented systems, it also turned out that 
Focus provides a mathematical framework for defining a formal semantics for 
hardware description languages like VHDL and system level languages like Stat- 
echarts. Furthermore, it was proven that Focus is not only appropriate to de- 
scribe pure software or hardware components but in addition can be used to 
specify mixed hardware/software systems, i.e. can serve as a formal foundation 
for Hardware/Software Codesign. 

2.9 Stockholm 

Foundations of Hardware Specification. K. Meinke has investigated the
foundations of hardware specification using higher-order algebra. Roughly this 
corresponds with the equational fragment of many-sorted higher-order logic, 
where higher types are function types. Thus it is related to Church’s system of 
finite types. This framework is natural for hardware description since it allows 
the direct representation of hardware devices as stream transformers. This obser- 
vation has also been confirmed elsewhere, e.g. the HOL community. Significant 
achievements in this area have been: 

— An exact characterization of the specification power of higher-order
equations [14]. Earlier work on this problem was published in [12]. This power
was shown to be Π^1_1, thus it properly includes the arithmetical hierarchy.
Furthermore, the expressive power of the hierarchy of specifications (second-order,
third-order, etc.) collapses to second-order. In this sense, (quantifier-free)
second-order equational logic is slightly more powerful than first-order
arithmetic.

— A proof theory for higher-order equations which exactly characterizes the 
case where higher-order equational reasoning is reducible to first-order equa- 
tional reasoning through the existence of normal form proofs for higher- 
order equations. This result makes use of a particularly interesting topol- 
ogy on higher-order algebras which seems to capture a definition of obser- 
vational equivalence for higher-order operations (e.g. stream transformers). 
Case study research [16] on Kung’s systolic convolution algorithm found this 
to be a useful result for theorem proving and hardware verification. 



Practical Case Studies Together with L.J. Steggles, K. Meinke studied some 
well known hardware algorithms in order to gain insight into their specification 
and verification requirements. These included a typical systolic algorithm (convolution)
and a typical dataflow algorithm (Hamming Stream). The convolution
algorithm led to fundamental insights in the proof theory of verification for
systolic algorithms (see above). The dataflow algorithm revealed the power of 
higher-order algebra for semantical verification. 

L.J. Steggles extended the theory of higher-order algebraic specification to 
transfinite types, and showed that this formed a suitable framework to capture 
the parametricity of families of hardware algorithms [22]. He also developed 
a theory of parameterized higher-order algebraic specifications, which allowed 
taking advantage of the polymorphism inherent in the convolution algorithm to 
simplify the specification [23]. 

K. Meinke investigated the impact of the proof theory of higher-order equa- 
tions on higher-order equational theorem proving, and verification of hardware 
by means of term rewriting in [13]. 



Software Tools Together with B.M. Hearn, K. Meinke designed and imple- 
mented a tool for parsing and executing higher-order algebraic specifications by 
rewriting of terms and types. This tool was based on earlier published research of 
K. Meinke. The tool and its input language are called ATLAS (A Typed Language
for Algebraic Specification) and are based on many-level algebraic specifications,
which can be used for equational specification both of data and types (by means
of type equations). ATLAS [9] is implemented in C under UNIX and has a Motif
graphical user interface. ATLAS was used to specify and verify hardware algo- 
rithms, as well as other case studies, including order-sorted, polymorphic and 
recursive types. 

The research on ATLAS was also taken further by E. Visser at the CWI 
Amsterdam (outside the NADA project).

2.10 Swansea

Algebraic Models of Microprocessors Work on microprocessor modeling 
has progressed from basic representations of simple processors, at the level of 
the programmer, through models of microprogrammed implementations, to mod- 
els of advanced processor implementations. In addition, considerable attention 
has been given to the problems of verifying the correctness of microprocessor
implementations.

At the start of the NADA project, iterated map microprocessor models were 
restricted to a simple PDP-8-derived example, at the level of the programmer. 
An implementation of the PDP-8 was developed, and in parallel a technique 
was developed for substantially simplifying the process of formal verification. 
This required that a number of conditions be met, which were trivially satisfied
by iterated map models of microprocessors, and their associated timing abstrac- 
tion functions. A more sophisticated example, with input and output, was then 
developed. This was based on a standard example, Gordon’s Computer. 

The substantial body of work on microprocessor representation has been con- 
cerned with pipelined, and especially superscalar processors. The problems that 
these bring are (a) substantially increased complexity of implementations; and 
(b) a timing model that differs substantially from that of the programmer’s level 
model, or architecture. Techniques were developed for representing superscalar 
microprocessors, and existing correctness conditions were modified. Work con- 
tinues on developing techniques for simplifying the verification of superscalar 
processors. 



An Algebraic Framework for Linking Program and Machine Semantics 

During the period 1994-1997, K. Stephenson and J.V. Tucker constructed an 
algebraic framework in which the process of executing a high-level program can 
be described at the level of hardware. There are many layers of abstraction 
involved between these two extremes. Each layer of the hierarchy consists of an 
algebraic specification for the semantics of the language at that level. To produce 
specifications for each layer requires the syntax and a semantic mapping of the 
language to be defined. In addition, to compare levels of the hierarchy and to 
define the relative correctness of levels to each other, these layers are structured 
further. 

General methods of defining syntax are based on the notion of filtering 
context-free supersets to produce algebraic specifications of non-context-free lan- 
guages. This work is set against a background of research into the relationship 
between context-free grammars and closed term algebras. 

An operational model of semantics is used in which the sequences of states 
produced by the execution of programs are explicitly clocked by time. This en- 
ables computable models of operational semantics to be developed and hence 
algebraically specified. In particular, this provides an algebraic approach to 
structural operational semantics of high-level languages. The approach is also 
applicable to the semantics of low-level languages; this work is set against a
background of research into machine semantics developed by J.V. Tucker and
N.A. Harman within NADA. 

In addition, each layer is structured so that the specifications have a com- 
mon signature. This allows the process of compilation between the layers to be 
expressed as a homomorphism. Correctness of one level of the hierarchy with 
respect to another can then be stated in terms of commutative diagrams. This 
work is set against a background of research in algebraic semantics. 

It has been shown, to deal with concepts at a higher level of abstraction, that 
the act of proving correctness can be reduced to establishing the correctness of 
the commutative diagram before execution (trivially) and at the end of the first 
time cycle. As these stages do not require structural induction to be used, this 
both dramatically reduces the work required, and also makes the process feasible 
for mechanical checking by a suitable theorem prover. 

The first two levels of this hierarchy are specified and are (manually) shown 
to be correct relative to each other for a non-trivial case study of a simple 
while programming language and a very abstract machine-level language. This 
framework is now being extended to deal with lower levels of the hierarchy, with
the aim of linking the work to that of N.A. Harman regarding the correctness of 
microprocessors. 



SCAs and Dynamical Systems Work by J.V. Tucker and M.J. Poole, in 
collaboration with A.V. Holden (University of Leeds) has focused on algebraic 
models of spatially extended computational systems. The theory of synchronous 
concurrent algorithms (SCAs) has been applied and extended to unify the mod- 
eling of a range of different types of hardware system and biological system. 

In their biological system applications, Tucker et al. have been primarily
interested in coupled map lattice, cellular automaton, neural network, coupled 
ordinary differential equation and partial differential equation models of spatially 
extended biological phenomena. They have concentrated on models of electrical 
phenomena in cardiac tissue. On making discrete approximations of continuous 
spaces and times, the diverse range of available models of cardiac excitation are 
all examples of SCAs and may therefore be studied in a unified way, from the 
point of view of parallel deterministic computation. The notion of discrete space 
and local connectivity in cardiac models corresponds closely with the notion of 
parallel digital systems; continuous state systems are closely related to analogue 
computing systems. 
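
As a rough illustration of how a discretised model can be read as a synchronous concurrent algorithm, the following Python sketch updates a one-dimensional ring of cells in lock step against a global clock. The ring topology, the logistic local map and the coupling constant `eps` are illustrative assumptions and not a model taken from the NADA work.

```python
from typing import List


def local_map(x: float) -> float:
    """Logistic map, a hypothetical stand-in for the local cell dynamics."""
    return 4.0 * x * (1.0 - x)


def sca_step(state: List[float], eps: float = 0.1) -> List[float]:
    """One global clock tick on a ring of cells.

    Every cell reads the old values of itself and its two neighbours,
    and all cells are updated simultaneously.
    """
    n = len(state)
    mapped = [local_map(x) for x in state]
    return [
        (1.0 - eps) * mapped[i]
        + (eps / 2.0) * (mapped[(i - 1) % n] + mapped[(i + 1) % n])
        for i in range(n)
    ]


def run(state: List[float], ticks: int, eps: float = 0.1) -> List[List[float]]:
    """Return the states at clock values 0, 1, ..., ticks."""
    history = [list(state)]
    for _ in range(ticks):
        history.append(sca_step(history[-1], eps))
    return history
```

Replacing `local_map` by a table lookup over a finite state set would give a cellular automaton in the same scheme, which is the sense in which the different model classes can be treated uniformly.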

Notions of hierarchies of spatially extended systems have been developed 
within the framework of SCAs. The formal concepts of hierarchy are derived 
from notions of abstraction and approximation between the structure and be- 
haviour of two SCAs, which in turn are built upon abstractions of time, space 
and state. The general theory has been exploited in the investigation of sys- 
tolic algorithms and, especially, of spatially extended biological systems. The 
concepts of hierarchy have led to the rigorous comparison of the behaviours of 
mathematically and biologically diverse models of cardiac electrical behaviour 
that are unified by SCA theory. Further, they have led to the construction of
hierarchical and hybrid SCA models of cardiac activity combining many component
models and reconstructing cardiac behaviour at many different, interacting,
levels of biological detail. 

2.11 Uppsala 

The work of the Uppsala site has primarily been concerned with semantical 
foundations of data types, streams and stream transformers. The main tool used 
has been the theory of domains. Domain theory arose from generalizations of 
the theory of recursion and computability in the work of D. Scott, Yu. Ershov 
and others. Its use in denotational semantics is well known. Uppsala have also 
considered the use of formal spaces and of certain non-standard models. 

Uppsala’s work has been concentrated on considering domain theory as a 
theory of computation on data types or topological algebras, where approxima- 
tions are the primary objects. That is, computations are performed on concrete 
approximations and the results are then transferred to ideal or non-concrete ele- 
ments (the elements of the considered topological algebra) via a limit process. A 
domain captures this idea well. The method used is that of domain representabil- 
ity for topological algebras, developed by V. Stoltenberg-Hansen and J.V. Tucker 
starting in 1985. This general technique was applied to the processing of streams 
with discrete and continuous time and data. 

Another tool useful for the study of computability for topological structures, 
and for obtaining constructive versions of non-constructive theorems such as the
Tychonoff or Hahn-Banach theorems, is the theory of formal spaces. Uppsala 
have made a comparison between this method and the method of domain repre- 
sentability showing their equivalence in certain precise senses for regular locally 
compact spaces. Uppsala have also found a connection between domain theory 
and model theory where saturation plays a key role. They have used this to give
a logical presentation of the Kleene-Kreisel continuous functionals. 

Finally, Uppsala have considered certain constructive non-standard models 
which they believe will be useful for foundational questions addressed by the 
NADA project. 

3 Observations on NIL — The NADA Integrated 
Language 

In this section we discuss observations on the requirements for future hardware
description languages as they have arisen during the NADA work.

3.1 Introduction 

The lack of a common hardware description language (HDL) with sound foundations
and clear concepts hinders progress in high-level hardware design. HDLs
in use support only the lower design level; their semantics are mostly simulator-based
and hence awkwardly complex.

NIL is a fictitious next generation hardware description language. This section
describes the concepts such a language would contain. It is based on the
experience of extensive case studies on specification, verification and deductive 
design within NADA. Detailed reports on these case studies can be found in the 
subsequent chapters. 

NIL employs concepts at a high level of abstraction and is based on a formally 
defined semantics. Description aspects include general questions of timing, pa- 
rameterization and modularization. The design techniques include verification, 
deductive design in the small and structured design in the large. 

During the lifetime of the NADA Working Group, its members considered
the functionally based language MHDL [21] as a possible basis for NIL. Unfortunately,
the development of that language was abandoned during the final
phase of NADA.

At the same time efforts towards the definition of a System Level Description
Language SLDL [24] as a successor to MHDL were set up. Currently this language 
is in its requirements definition phase. Many of the aims put forward by the SLDL 
committee fall well in line with observations by NADA. Therefore we shall freely 
quote the current SLDL requirements document to underline our views. Such 
citations are marked by slight indentation and a slanted type font. The mission 
statement of the SLDL effort is as follows: 

    To support the creation and/or standardization of a language or representation,
    or set of languages or representations, that will enable engineers to
    describe single and multi-chip silicon-based embedded systems to any desired
    degree of detail and that will allow engineers to verify and/or validate those
    systems with appropriate tools and techniques. The languages or representations
    will address systems under design and their operating environment,
    including software, mixed-signal, optical, and micro-electronic mechanical
    characteristics.

The SLDL requirements document goes on to say: 

    Note that the mission statement does not necessarily require a specific language;
    a set of languages, or a meta-notation, or something of this sort is
    perfectly acceptable as long as it meets the requirements set forth in this
    document.

This agrees very well with the actual road the NADA work took: the different 
groups have developed various formalisms, each suited to a particular problem 
domain. Due to lack of manpower, the complete design and definition of NIL 
could not have been achieved within NADA anyway. 

3.2 Central Requirements on NIL 

This section surveys the central requirements on NIL as they were identified 
within NADA. We deem the concepts essential for structured hardware descrip- 
tion and design; however, they are only incompletely realized in existing HDLs 
such as VHDL, CIRCAL, ELLA, LTS. The reader is invited to compare the 
requirements with the SLDL ones.

Scope NIL is a general purpose language supporting the description of different
types of hardware at several levels of abstraction, in particular, the behavioural, 
the structural and the layout levels. 

Declarative Style NIL should have a clean declarative core which supports real- 
ization-independent behavioural specifications. Declarative hardware description 
abstracts from layout structure and computational process to functional be- 
haviour. 

Semantic Coherence NIL must provide a coherent semantic framework which, 
in the design, allows passing smoothly from specifications through component 
descriptions to concrete realizations. In particular, there have to be constructs
for expressing non-functional aspects (resource bounds), among which are the 
allocation of computations both in space and time. The semantic model must 
coherently cover continuous and discrete notions of time as well as a notion of 
space. 

Structuring and Parameterization NIL must support the decomposition of 
complex systems into subsystems in all design phases. For this purpose, it has 
to offer flexible component descriptions with clear interface specifications. 

It must allow hierarchical design, large-scale parameterization and re-use of 
components. 

Design Correctness NIL must support establishing the correctness of a design 
by formal methods. It has to encourage a modular and stepwise design process, 
based on sound notions of refinement. 

In particular, the semantic model must allow concise transformation and 
verification rules that support deductive design and design verification. The re- 
finement and abstraction concepts comprise timing, structure and layout. NIL 
supports the analysis of designs by providing rules for inferring properties from
descriptions. 

Extensibility NIL must be extensible so that users can tailor it to their partic- 
ular needs. Such language extensions may capture specialized types of hardware 
(synchronous circuits, combinational circuits, associative memory) or particular 
realization levels (register transfer level, CMOS level). 

The extension layers have to be explained in terms of the kernel language. In 
this way the language itself has a modular structure. This scheme has profitably 
and successfully been applied to the language CIP-L [1]. 

Tools and Graphics NIL has to be designed such that it interfaces with well- 
established tools such as VERILOG and VHDL simulators and compilers. In 
particular, NIL must designate operational sublayers for rapid prototyping, sim- 
ulation and compilation. NIL also offers a standard interface to graphical tools 
supporting the visualization and visual manipulation of designs.

Use in Engineering and Education To gain wide acceptance, NIL cannot
involve sophisticated concepts. The semantics and the type system should be 
as simple as possible. For the engineer, NIL must combine manipulative fluency 
with intuitive comprehension. For education, it should serve as a conceptually 
well-formed and clean vehicle for teaching hardware design methods. Moreover, 
NIL should attempt to bridge the gap between software and hardware design. 



Let us conclude this section with another quotation from the SLDL document:

    The key activities of systems design include feasibility studies, concept selection,
    product selection and architectural (high-level) design. System design
    typically includes the following capabilities (as described in the section below
    on SCOPE):

    — Several levels of abstraction that are essential to support feasibility studies
      and concept selection.

    — Modularity in both functional and structural design to support product
      selection.

    — Verification, including analyses such as simulation modeling.

    — Provision for constraints and back annotation from downstream design
      stages.

    — Test planning to support architectural design.



3.3 Survey of NIL Core Concepts 

Approach The approaches to formalization of hardware aspects used by NADA 
members are based on predicate logic and concepts from functional program- 
ming. Therefore it is natural that the core of NIL should essentially be a suitable 
fragment of logic and typed $\lambda$-calculus.

The core should stay close to standard algebraic and functional specification, 
i.e., use equational many-sorted logic, as long as possible. This will motivate the 
recommendations below on the various choice options. 

An important question is whether to use total or partial functions as opera- 
tors. Differing from the CoFI initiative [2], for simplicity NIL uses total functions 
because they lead to a much easier logical framework. Partial functions can be 
simulated through suitable totalization or by simulating relations by set-valued 
functions. 



Basic Concepts The basic entities involved in any component are data, time 
and space. 

Based on these, streams, also known as traces or waveforms, record the tem- 
poral succession of data at a given input or output port, whereas states record for 
each point in space and each time a data value. The computation by a module 
at a location in space is described by a stream transformer.

Data The data involved in the circuit level of hardware description mostly are
of a very simple kind, such as Booleans or more refined logical values. At higher
levels, particularly at the architecture level, also more involved data structures
are needed. While equational axiomatizations using a generation principle are
well-known, it may be advantageous to include certain standard data type constructors
such as sums or products to allow recursive data types in the form
known from functional programming languages.

All these data structures are discrete. It has to be determined whether con- 
tinuous data make sense and how they would be described. This aspect certainly 
is a core concept. 

Time NIL must include various concepts of time. Some options are 

— discrete vs. continuous time; 

— time with and without a starting point; 

— linearly vs partially ordered time. 

Discrete time usually is isomorphic to (an interval of) the natural numbers. 
It is easily axiomatized using an enumeration function or inductive definition. 

Continuous time cannot be enumerated. It is best axiomatized using an order 
relation. This suggests use of predicate symbols other than just equality. Non- 
discreteness is axiomatized e.g. by requiring 

$\forall x, y : x < y \Rightarrow \exists z : x < z < y$ ,

which involves an existential quantifier. This can be avoided by introducing a 
Skolem function $\mathit{between} : \mathit{Time} \times \mathit{Time} \to \mathit{Time}$ with the axiom

$x < y \Rightarrow x < \mathit{between}(x, y) < y$ .

Completeness is even worse, since it requires quantification over subsets to talk 
about infima/suprema. Given the increasing importance of hybrid systems, however,
continuous time cannot be neglected. For many hardware purposes discrete time 
will suffice, though. 
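
The two axiomatization styles can be made concrete with a small Python sketch. It assumes, purely for illustration, that discrete time is generated from 0 by a successor function and that dense time is modelled by the rationals with the midpoint as the Skolem function `between`. Note that the rationals are dense but not complete, so the sketch says nothing about the completeness requirement.

```python
from fractions import Fraction
from itertools import product


def successor(t: int) -> int:
    """Discrete time generated inductively from 0 (assumed starting point)."""
    return t + 1


def between(x: Fraction, y: Fraction) -> Fraction:
    """Hypothetical Skolem function: the midpoint witnesses density."""
    return (x + y) / 2


def density_axiom_holds(x: Fraction, y: Fraction) -> bool:
    """Check the quantifier-free axiom x < y => x < between(x, y) < y."""
    return not (x < y) or (x < between(x, y) < y)


# Check the axiom on a finite sample of rational time points.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]
all_ok = all(density_axiom_holds(x, y) for x, y in product(samples, repeat=2))
```

Because the Skolemized axiom is quantifier free, it can be checked pointwise in this way, whereas the original formulation would require a search for the witness `z`.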

Linear time is adequate for modeling systems with a global time. For asyn- 
chronous systems it may be advantageous to use a partial ordering of time. 

Space Space serves to locate single subcomponents, which are entities with a 
well-defined extent. Hence the concept of continuous space is not relevant for 
NIL. Discrete space can be described via suitable enumeration functions or, in a 
more structured way, by inductive definition. 

The elements of the space domain will be called locations.

For the description of placement and topologies the space domain needs to be 
equipped — in increasing degree of precision — with a neighbourhood relation, 
coordinates or even a metric.

Timing Disciplines Once a notion of space has been fixed, we can also talk
about global or local time. For global time, a single sort Time is sufficient. 
To describe local time, one either needs to view Time as a sort constructor 
parameterized by elements of Space, or to reflect locality by retimings between 
the subcomponents. 

For various retiming concepts see Lustre, Signal, Ruby as well as the SCA 
framework. 

Connectivity The neighbourhood relation between locations can also be given 
by a function $\mathit{Nbd} : \mathit{Space} \times \mathit{Time} \to \mathit{Set}(\mathit{Space})$ that assigns to each point a
neighbourhood, i.e., a set of other points. Depending on whether the underlying 
relation is symmetric, this models bi-directionality or uni-directionality. 

The neighbourhood relation may even be time-dependent. This way one can 
describe dynamic structures as used e.g. in on-the-fly programming of FPGAs.
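
A minimal Python sketch of such a time-dependent neighbourhood function follows. The grid size and the rule that horizontal links are active at even times and vertical links at odd times are invented for illustration; only the shape of the function, from a location and a time to a set of locations, follows the text.

```python
from typing import FrozenSet, Tuple

Location = Tuple[int, int]


def nbd(p: Location, t: int, width: int = 3, height: int = 3) -> FrozenSet[Location]:
    """Hypothetical Nbd : Space x Time -> Set(Space) on a small grid.

    At even times the horizontal links are active, at odd times the
    vertical ones, mimicking a structure that is reconfigured over time.
    """
    x, y = p
    if t % 2 == 0:
        candidates = [(x - 1, y), (x + 1, y)]
    else:
        candidates = [(x, y - 1), (x, y + 1)]
    return frozenset(
        (a, b) for a, b in candidates if 0 <= a < width and 0 <= b < height
    )


def is_symmetric(t: int, width: int = 3, height: int = 3) -> bool:
    """True if q in Nbd(p, t) implies p in Nbd(q, t), i.e. links are bi-directional."""
    space = [(x, y) for x in range(width) for y in range(height)]
    return all(
        p in nbd(q, t, width, height)
        for p in space
        for q in nbd(p, t, width, height)
    )
```

Dropping one direction from `candidates` would make `is_symmetric` fail, which corresponds to the uni-directional case.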

Another issue about connectivity is the treatment of channels. Options here 
are 



— explicit naming/renaming/hiding of channels; 

— use of special constants for interconnection networks. 

Case studies have shown that both should be accommodated. Since they are 
inter-definable it is to be determined which option should go into the core. 

Streams and Waveforms The options here are total vs. partial streams. Total 
streams are simply functions from time to data. Partial streams stop after a 
certain time. Among these one finally can distinguish between complete and 
incomplete. Total streams fit in nicely with the decision to use total operations. 
Axiomatizations of partial streams are less direct in this setting. Possibilities
are:

1. use a dummy element and inflate each partial stream to a total one that 
constantly yields the dummy after a certain initial part; 

2. use higher-order concepts to distinguish the set of all (proper and improper) 
intervals of the time domain and have streams as functions from these inter- 
vals to data; 

3. use domain-theoretic notions to characterize total streams as limits of the 
partial ones. 

Of these, the first is the simplest. 
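
The first option can be sketched in a few lines of Python. Discrete time is assumed to be the natural numbers, and `None` plays the role of the dummy element; both choices, and the helper names, are assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

# A total stream: a function from discrete time to data or the dummy.
Stream = Callable[[int], Optional[int]]

DUMMY = None  # hypothetical dummy element, assumed not to be a proper data value


def inflate(prefix: Sequence[int]) -> Stream:
    """Turn a partial stream (a finite prefix) into a total one padded with DUMMY."""
    def stream(t: int) -> Optional[int]:
        return prefix[t] if 0 <= t < len(prefix) else DUMMY
    return stream


def defined_length(s: Stream, horizon: int) -> int:
    """Number of proper values before the first DUMMY, searched up to `horizon`."""
    for t in range(horizon):
        if s(t) is DUMMY:
            return t
    return horizon
```

The price of this simplicity is that every operation on data has to say what it does with the dummy, which is one reason the other two options remain of interest.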

As soon as continuous time is used, streams should better be called waveforms.
For them, option 2 for describing partial waveforms becomes more interesting,
since then quantification over subsets of the time domain is used
anyway.

Stream Transformers This is probably the easiest concept: a stream trans- 
former is a function from (tuples of) streams to (tuples of) streams. This is 
independent of all choices for the underlying domains of time, data and streams.

This notion suggests the use of higher types. As Möller [17, 18] and Meinke
[11] have shown, this fits in well with equational logic, even if the axiom of
extensionality is required to hold. Higher types would also easily accommodate
a full typed $\lambda$-calculus, which is convenient in many formulations.

For the stream transformers further properties, such as monotonicity, conti- 
nuity or causality, may be required. 
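
To make the higher-type reading concrete, here is a small Python sketch in which streams are total functions from discrete time to data and stream transformers are functions on such functions. The names `delay`, `pointwise`, `compose` and `agree_up_to` are illustrative and not taken from any NADA formalism.

```python
from typing import Callable

Stream = Callable[[int], int]
Transformer = Callable[[Stream], Stream]


def delay(init: int) -> Transformer:
    """Unit delay (register): output `init` at time 0, afterwards the previous input."""
    def transform(s: Stream) -> Stream:
        return lambda t: init if t == 0 else s(t - 1)
    return transform


def pointwise(f: Callable[[int], int]) -> Transformer:
    """Lift a combinational function on data to a transformer on streams."""
    return lambda s: (lambda t: f(s(t)))


def compose(g: Transformer, f: Transformer) -> Transformer:
    """Sequential composition: feed the output stream of f into g."""
    return lambda s: g(f(s))


def agree_up_to(a: Stream, b: Stream, n: int) -> bool:
    """True if the two streams coincide at times 0 .. n-1."""
    return all(a(t) == b(t) for t in range(n))
```

Causality can then be tested on examples: if two input streams agree up to time n, the outputs of a causal transformer must agree up to time n as well. For a transformer built from `delay`, they agree even one step longer.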

Non-Determinacy There is considerable variety in how NADA members model
non-determinacy (if at all). The concepts used here are, in increasing semantic 
complexity, 

— sets of streams (traces); 

— sets of stream transformers; 

— CCS/CSP-like processes. 

Whereas the Amsterdam partners are using their well-established process alge- 
bra, essentially a first-order equational theory, the other two approaches heavily 
use higher-order concepts. 

Uses of non-determinism in hardware description are e.g. modeling the ac- 
cesses to a bus structure and handling of I/O events. 

State Certain components, such as sequential circuits, depend on an inter- 
nal state, whereas others, such as combinational circuits, do not. Depending on 
whether a component employs global or local time for its subcomponents, the 
state will be global or locally distributed. A global state can be modeled by
a function $\mathit{State} : \mathit{Space} \times \mathit{Time} \to \mathit{Data}$, where $\mathit{Time}$ is the global time domain.
If streams are simply functions from $\mathit{Time}$ to $\mathit{Data}$, one can, by currying,
equivalently use $\mathit{State} : \mathit{Space} \to \mathit{Stream}$.
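
The currying step can be spelled out in Python. The concrete `state` function below is a toy chosen only to have something to evaluate; the sorts are all modelled as integers, which is an assumption of the sketch.

```python
from typing import Callable

Space = int
Time = int
Data = int
Stream = Callable[[Time], Data]


def state(p: Space, t: Time) -> Data:
    """Toy global state: location p holds the value max(t - p, 0) at time t."""
    return max(t - p, 0)


def curry(f: Callable[[Space, Time], Data]) -> Callable[[Space], Stream]:
    """Turn State : Space x Time -> Data into State : Space -> Stream."""
    return lambda p: (lambda t: f(p, t))


def uncurry(g: Callable[[Space], Stream]) -> Callable[[Space, Time], Data]:
    """Inverse direction, recovering the two-argument form."""
    return lambda p, t: g(p)(t)
```

For example, `curry(state)(2)` is the stream observed at location 2, and `uncurry(curry(state))` agrees with `state` on all arguments.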

Another way to model states is to use a stream of states in a feedback loop. 

In components with local time, local streams may be used to describe the local 
states. However, this would mean that Stream now needs to be a parameterized 
data type constructor, and the State function now needs to involve a dependent 
type: 

$\mathit{State} : (x : \mathit{Space}) \to \mathit{Stream}(\mathit{Time}(x), \mathit{Data}(x))$ .

In components with global state, the stream transformer at each point in 
space may or may not depend on that state. 

Architecture The architecture of a component is described by its connectivity, 
the timing discipline (global or local), the global state function (if any), the 
stream transformer or process at each location and the parallel composition of 
all these. If the definition of parallel composition is taken general enough it also 
comprises sequential composition and feedback. 

Systems A system is the parallel composition of components. It will again 
involve a notion of space to locate the components, and a notion of connec- 
tivity. To allow hierarchical structuring, it would be most simple and coherent 
to view systems again as components. However, then parallel composition of
stream transformers, which model the computation in subcomponents, has to
be somehow unified with the “parallel composition” of algebras, which model 
the components. 

In any case, there have to be concepts for expressing modularization, i.e., 
composition/decomposition of systems and hiding of internal structure; the se- 
mantics has to be compositional w.r.t. the correct implementation relation. 

Possible Extensions Here we first list a few concepts that have proved valuable 
in some NADA case studies, but entail a certain semantic complexity, so that 
it is unclear whether they should be part of overall NIL. These are, above all, 
parameterization of types, polymorphism and dependent types. While each of 
these topics is well-understood by itself, their combination leads to problems 
with type checking that have not yet been resolved. 

Another topic is the formal description of interfaces. While purely syntactic
descriptions exist in many programming languages, in the overall line of NADA 
NIL would have to contain interface descriptions as “first-class citizens” that 
can be freely manipulated and formally reasoned about. Experiments have been 
performed to describe interfaces in the form of higher-order logic expressions, 
but no conclusive solution has been found. 

The second group of concepts that would enter, but for which no case studies 
have been performed within NADA, are non-functional aspects, such as speed 
and size of components. Examples of this are the clock calculus in Lustre/Signal 
(which is used also for determining whether a stream-based specification can 
have a realization in bounded memory at all) or Skillicorn’s cost calculus for 
parallel algorithms on lists. 

3.4 Reasoning 

One purpose of a formal system description is to allow precise reasoning about 
it. To this end, a precise formal semantics of the description language has to be 
given. A central notion is that of one component correctly implementing another. 

Concepts of Correctness There are well-studied implementation relations 
both for the algebraic and the logical/functional/relational views of systems. 
In the presence of non-determinacy, the implementation must allow behaviour 
refinement (reduction of non-determinacy and/or increase of “definedness”). Further
essential concepts are abstraction from layout/topology, interface refinement
and data refinement. Moreover implementations via retimings have to be taken 
into account. 
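
For the simplest of the non-determinacy models, sets of traces, behaviour refinement as reduction of non-determinacy amounts to set inclusion. The Python sketch below assumes finite traces and additionally requires the implementation to be non-empty; the bus arbiter example and all names are hypothetical.

```python
from typing import FrozenSet, Tuple

Trace = Tuple[int, ...]
Behaviour = FrozenSet[Trace]  # a non-deterministic component as a set of traces


def refines(impl: Behaviour, spec: Behaviour) -> bool:
    """impl implements spec if it is non-empty and shows only traces allowed by spec."""
    return len(impl) > 0 and impl <= spec


# Hypothetical bus arbiter: the specification allows either requester
# (0 or 1) to be granted first.
spec: Behaviour = frozenset({(0, 1), (1, 0)})
fixed_priority: Behaviour = frozenset({(0, 1)})  # always grants requester 0 first
faulty: Behaviour = frozenset({(0, 0)})          # grants requester 0 twice
```

Here `fixed_priority` refines `spec` because it resolves the choice left open by the specification, while `faulty` does not because it shows a trace the specification forbids. Refinement by increase of definedness, interface refinement and retimings are not covered by this sketch.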

Reasoning Tools Since one of our basic aims is to keep the underlying logic 
as simple as possible, also our reasoning tools should be somewhat restricted: 

— equational reasoning using the equational axioms or fixpoint properties of 
recursively defined entities; 

— in the case of refinement, inequational reasoning using the refinement relation 
and monotonicity;

— induction principles, viz.

• structural induction and 

• fixpoint induction for recursive definitions (using both least and greatest 
fixpoint semantics); 

— uniqueness of the solution of certain fixpoint equations. 



Design Verification In verification, one constructs the implementation by 
some independent procedure and afterwards tries to show that it correctly implements
the specification.

The methodical advantage of this is that one gives two views of a system; if 
they can be correctly related, this increases confidence in the overall formaliza- 
tion. 

Relevant techniques here include model checking and equational reasoning 
over algebraic specifications. 

Deductive Design In deductive design, one constructs the implementation 
from the specification in some systematic way. Current methods are stepwise
transformation (refinement) or extraction of a program from a proof that the 
specification can be satisfied. 

Interplay Verification and deductive design may be combined fruitfully. For 
instance, one can develop a library of components using deductive design and 
use their initial specifications for the verification of larger systems. 

4 Survey of the Chapters of the Book 

The chapters of this book have been grouped according to the main research 
themes within NADA. Besides the language aspects dealt with in the present 
chapter, these themes were Mathematical Foundations, Hardware and Dynami- 
cal Systems and Verification and Deductive Design. 



4.1 Mathematical Foundations 

As is clear from the discussion in previous sections, streams and stream trans- 
formers are a central topic in formal hardware description. The first two chapters 
of this part of the book provide mathematical treatments of this subject. 

The chapter Streams, stream transformers and domain representations by J. 
Blanck, V. Stoltenberg-Hansen and J.V. Tucker presents a general theory for 
the computation of stream transformers of the form $F\colon (R \to B) \to (T \to A)$,
where the domains T and R for time and A and B for data may be discrete or 
continuous. The authors show how methods for representing topological algebras 
by algebraic domains can be applied to transformations of continuous streams. 
A stream transformer is continuous in the compact-open topology on continuous 
streams if and only if it has a continuous lifting to a standard algebraic domain 
representation of such streams. The chapter also examines the important problem
of representing discontinuous streams, such as signals $T \to A$ where time $T$
is continuous and data $A$ is discrete.

In the chapter Ideal stream algebra, B. Möller provides some mathematical
properties of behaviours of systems, where the individual elements of a behaviour
are modeled by ideals of a suitable partial order. It is well known that the
associated ideal completion provides a simple way of constructing algebraic cpos. An
ideal can be viewed as a set of consistent finite or compact approximations of an
object which itself may be infinite. A special case is the domain of streams where
the finite approximations are the finite prefixes of a stream. The author defines
a number of operators on ideals and behaviours and proves distributivity and
monotonicity laws that are the basis for correct refinement of specifications into
implementations. Various small examples illustrate that the operators lead to
very concise yet quite clear specifications. The chapter also gives a
characterization of safety and liveness and generalizes the Alpern/Schneider decomposition
lemma to arbitrary domains. An extended example concerns the specification
and transformational development of an asynchronous bounded queue.

The final chapter of this part, Normalisation by evaluation by U. Berger, M.
Eberl and H. Schwichtenberg, deals with a different area of the mathematical
foundations of hardware design. It extends normalization by evaluation from
the pure typed λ-calculus to general higher type term rewrite systems. This work
also gives a theoretical explanation of the normalization algorithm implemented
in the verification system MINLOG.

4.2 Hardware and Dynamical Systems 

In Algebraic models of superscalar microprocessor implementations: a case study,
A.C.J. Fox and N.A. Harman extend a set of algebraic tools for microprocessor 
specification to model superscalar microprocessor implementations, and apply 
them to a case study. They develop existing correctness models to accommodate 
the more advanced timing relationships of superscalar processors, and consider 
formal verification. They illustrate their tools and techniques with an in-depth 
treatment of an example superscalar implementation. Clocks divide time into 
(not necessarily equal) segments, defined by the natural timing of the com- 
putational process of a device. The authors formally relate clocks by surjective, 
monotonic maps called retimings. In the case of superscalar microprocessors, the 
normal relationship between “architectural time” and “implementation time” is 
complicated by the fact that events that are distinct in time at the architectural 
level can occur simultaneously at the implementation level. 
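
The notion of a retiming mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the retiming maps implementation clock cycles onto architectural steps and that every instruction takes three cycles; the names `retiming`, `is_monotonic` and `covers` are hypothetical and do not come from the chapter being surveyed.

```python
def retiming(t: int) -> int:
    """Hypothetical retiming: implementation cycle t belongs to
    architectural step t // 3 (three cycles per instruction)."""
    return t // 3


def is_monotonic(lam, horizon: int) -> bool:
    """Check lam(t) <= lam(t + 1) for all t below the horizon."""
    return all(lam(t) <= lam(t + 1) for t in range(horizon))


def covers(lam, horizon: int, targets: int) -> bool:
    """Check that every step 0 .. targets - 1 is hit within the first
    `horizon` cycles (a finite stand-in for surjectivity)."""
    return set(range(targets)) <= {lam(t) for t in range(horizon)}
```

A map such as `lambda t: 2 * t` is monotonic but skips every odd step, so it fails the `covers` check and would not qualify as a retiming in this sense.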

The chapter Hierarchies of spatially extended systems and synchronous
concurrent algorithms by M.J. Poole, A.V. Holden and J.V. Tucker takes a very
broad view of hardware, including even biological systems. First, the authors
study the general idea of a spatially extended system (SES) and argue that
many mathematical models of systems in computing and natural science are
examples of SESs. They examine the computability and the equational definability
of SESs and show that, in the discrete case, there is a natural sense in which
an SES is computable if, and only if, it is definable by equations. The authors
look at a simple idea of hierarchical structure for SESs and, using respacings
and retimings, define how one SES abstracts, approximates, or is implemented
by another SES. Secondly, the authors study a special kind of SES called a
synchronous concurrent algorithm (SCA). They define the simplest kind of SCA,
with a global clock and unit delay; such SCAs are computable and equationally
definable by primitive recursive equations over time. The authors focus on two
examples of SCAs: a systolic array for convolution and a non-linear model of
cardiac tissue. The chapter investigates the hierarchical structure of SCAs by
applying the earlier general concepts for the hierarchical structure of SESs. The
authors apply the resulting SCA hierarchy to the formal analysis of both the
implementation of a systolic array and the approximation of a biologically detailed
model of cardiac tissue.

In Towards an algebraic specification of the Java virtual machine K. Stephen- 
son develops an algebraic specification of the architecture of an abstract and 
simplified version of the Java Virtual Machine (JVM). This concentration on 
the implementation-independent features of the machine allows her to build a 
clean and easily comprehensible model in which its structure is emphasized. The 
author then axiomatizes the semantics of programs on this architecture. She 
also considers how one can concretize this abstract model, which provides a firm
foundation for exploring the entire JVM and thus for analyzing the correctness
of Java implementations.

In the next chapter, J.A. Bergstra and A. Ponse study Grid protocol specifi- 
cations. A grid protocol models concurrent computation, and consists of one or 
more modules repeatedly performing parallel I/O and computation. The authors 
provide several concise specification formats and correctness results on (external) 
I/O behaviour and illustrate their approach by examples. 

The aim of the chapter The computational description of analogue system be- 
haviour by P.T. Breuer, N. Martinez Madrid and C. Delgado Kloos is to define a 
simple analogue hardware description language L and give it a sound semantics 
that supports formal reasoning about its properties. The syntax of L is that of 
a hybrid programming language, but the semantics has been derived from the
analogue signal semantics of the upcoming IEEE VHDL-AMS extension to the 
IEEE standard digital hardware description language, VHDL. L is here given 
two semantics. Firstly, what may be termed an exact, or hardware, semantics 
and, secondly, an approximation, or simulation, semantics. The simulation se- 
mantics is computable and the hardware semantics is not. The authors show that 
the simulation semantics approximates the hardware semantics in a well-defined 
sense. This property is a “no surprises” guarantee with respect to simulation for
the language. 

4.3 Verification and Deductive Design

This part starts with Reasoning about imperfect digital systems by K. Hanna. The 
basis of his chapter is the observation that in order to realize digital systems that 
operate at high speeds or that have very low power consumption, it is necessary 
to work directly at the analog level of abstraction, that is, in terms of analog
electronic components such as resistors and transistors. Although the external 
behaviour of such circuits can be described digitally, their internal operation 
can only be explained by working at the analog level and by taking account 
of both voltages and currents. This chapter describes how existing methods of 
specification and formal verification of digital systems can be extended so as to 
encompass such analog designs in a fully rigorous manner. 

The chapter by J. Philipps and P. Scholz is about Formal verification and 
hardware design with statecharts. Statecharts extend the concept of Mealy Ma- 
chines by parallel composition, hierarchy, and broadcast communication. While 
Statecharts in principle are widely accepted in industry, some semantical con- 
cepts, especially broadcasting, are still contested. In this contribution, the au- 
thors present a Statechart dialect that includes the basic concepts of the language 
and present a formal, relational semantics for it. They show that this seman- 
tics can be used for both formal verification by model checking and hardware 
synthesis. 

The chapter An exercise in conditional refinement by K. Stølen and M. Fuchs
is an attempt to demonstrate the potential of conditional refinement in step-wise 
system development. In particular, it emphasizes the ease with which conditional 
refinement allows boundedness constraints to be introduced in a specification 
based on unbounded resources. For example, a specification based on purely 
asynchronous communication can be conditionally refined into a specification 
using time-synchronous communication. The presentation is built around a small 
case-study: A step-wise design of a timed FIFO queue that is partly to be im- 
plemented in hardware and partly to be implemented in software. The authors 
first specify the external behaviour of the queue ignoring timing and synchro- 
nization. This overall specification is then restated in a time-synchronous setting 
and thereafter refined into a composite specification consisting of three subspec- 
ifications: A specification of a time-synchronous hardware queue, a specification 
of an asynchronous software queue, and a specification of an interface compo- 
nent managing the communication between the first two. The authors argue 
that the three overall specifications can be related by conditional refinement. By 
further steps of conditional refinement, additional boundedness constraints are 
introduced. The chapter explains how each step of conditional refinement can 
be formally verified in a compositional manner. 

The final chapter by Möller presents Deductive hardware design: a functional
approach. As stated earlier, the goal of deductive design is the systematic con- 
struction of a system implementation starting from its behavioural specification 
according to formal, provably correct rules. The author uses Haskell to formulate 
a functional model of directional, synchronous and deterministic systems with 
discrete time. The associated algebraic laws are then employed in deductive 
hardware design of basic combinational and sequential circuits as well as a brief 
account of pipelining. Several of the IFIP WG 10.5 benchmark verification
problems are also tackled in this way. Special emphasis is laid on parameterization
and re-usability aspects. 
Abstract. We present a general theory for the computation of stream
transformers of the form $F\colon (R \to B) \to (T \to A)$, where time $T$ and $R$,
and data $A$ and $B$, are discrete or continuous. We show how methods
for representing topological algebras by algebraic domains can be
applied to transformations of continuous streams. A stream transformer is
continuous in the compact-open topology on continuous streams if and
only if it has a continuous lifting to a standard algebraic domain
representation of such streams. We also examine the important problem of
representing discontinuous streams, such as signals $T \to A$, where time
$T$ is continuous and data $A$ is discrete.



1 Introduction 

1.1 Background 

Computing systems are implemented in physical systems, and physical systems 
are simulated by computing systems. Because of the importance of the applica- 
tions, there is a need for theoretically sound methods for modelling computations 
involving the interface between the continuous models of physical processes and 
the discrete models of algorithmic processes on digital computers. The theory 
of hardware design can reveal a number of common features between certain 
classes of computing and physical systems (see [36]). Perhaps the simplest com- 
mon feature is that computing and physical systems both process streams. 

A stream is simply a sequence $(a_t)_{t \in T}$ of data $a_t \in A$ indexed by
time $t \in T$, i.e., a function from $T$ to $A$. Time may be
discrete, when typically it is modelled by the set $\mathbb{N}$ of natural numbers; or time
may be continuous, when typically it is modelled by the set $\mathbb{R}_+$ of non-negative
real numbers. Time may also be modelled more abstractly.
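
As a minimal sketch of this view, a stream can be written directly as a function from time to data. The names `bit_stream`, `sine_signal` and `prefix` are illustrative assumptions rather than notation from the paper; Python's `int` stands in for the natural numbers and `float` for the non-negative reals.

```python
import math
from typing import Callable, List


def bit_stream(t: int) -> int:
    """Discrete-time stream 0, 1, 0, 1, ... indexed by natural numbers."""
    return t % 2


def sine_signal(t: float) -> float:
    """Continuous-time stream: a function from non-negative reals to reals."""
    return math.sin(t)


def prefix(stream: Callable[[int], int], n: int) -> List[int]:
    """The first n values of a discrete-time stream (a finite approximation)."""
    return [stream(t) for t in range(n)]
```

Only finite prefixes of a discrete-time stream can ever be inspected, which is why `prefix` returns a list while the stream itself stays a function.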

Most computing systems are designed to operate in discrete time. Their
underlying algorithms and architectures are designed to process infinite streams of
data such as streams of bits and bytes. Numerous examples of such systems can
be found among computers, application specific chips, operating systems and
networks. The theoretical development of systolic arrays, dataflow architectures

B. Möller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 27-68, 1998.

Springer-Verlag Berlin Heidelberg 1998







J. Blanck, V. Stoltenberg-Hansen, and J.V. Tucker 



and distributed parallel systems in computing have motivated a great deal of 
research on stream processing with discrete time. See the recent survey [41], for 
example. It should also be noted that discrete time streams are basic for com- 
puter modelling methods such as neural networks [1], cellular automata [59], and 
coupled map lattices [8,26,28]. 

Most physical systems operate in continuous time. Their mathematical mod- 
els can also be viewed as processing continuous streams. For example, partial 
differential equations specify a function that describes the behaviour of a system 
from an initial state under streams of input, such as wave forms, boundary condi- 
tions, etc. There has been relatively little research on computing with continuous 
streams though recently the subject has been studied as part of the design of 
hybrid systems (see e.g. [16]). 

In numerical modelling both discrete and continuous stream processing are 
fundamental. Solution techniques for solving differential equations, such as finite
difference and finite element methods, are based on the discretisation of time and
space. They involve methods for approximating continuous streams by discrete 
streams. Both discrete and continuous streams are needed in modelling hybrid 
systems for control and instrumentation. 
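
The simplest way to approximate a continuous stream by a discrete one is uniform sampling. The sketch below is only an illustration of that idea; the function name `sample`, the step size and the example signal `ramp` are assumptions, not constructions taken from the paper.

```python
from typing import Callable


def sample(signal: Callable[[float], float], step: float) -> Callable[[int], float]:
    """Turn a continuous-time stream into a discrete-time stream.

    The discrete stream at index n takes the value of the signal at
    time n * step.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return lambda n: signal(n * step)


def ramp(t: float) -> float:
    """Hypothetical example signal x(t) = 3t."""
    return 3.0 * t


sampled = sample(ramp, 0.5)
```

Here `sampled` is a stream over discrete time whose n-th value is the ramp at time n/2; a smaller step gives a finer approximation of the same signal.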

In the light of these observations the following problem is important: 

To create a computability theory for stream processing over abstract
data types that can be applied to continuous and discrete models of 
physical and computing systems. 



1.2 Domain representations of streams and stream transformers 

In this paper we present a unified semantic treatment of discrete and continuous
streams, and stream transformers. We will focus on problems concerning the 
continuity and computability of streams and stream transformers, and we will 
emphasise continuous streams. The framework is based on domain representation 
theory, which is a theory of representing topological algebras using domains [47]. 
The topological algebras are used to model data types, and the domains are used 
to model implementations of data types. 

Domain theory is an abstract theory of approximation of spaces and func- 
tions, aimed at isolating the structures underlying computation. A domain is 
an ordered set of approximations on which functions that are continuous in an 
order-theoretic sense are defined. Of special importance is the fact that domain
theory possesses elegant theories of (i) constructions of spaces of continuous 
functions; and (ii) effectively computable domains. At the heart of the subject 
is an intimate relation between computability and continuity. 

Domain representation theory allows domains to model concrete representations
of topological algebras. To a topological algebra $A$ is associated an algebraic
domain $D_A$ from which a subset is selected to make a representation of $A$
via a continuous map $\nu$ from that subset to $A$. On choosing a domain
representation, problems about the computability and continuity of functions on
topological algebras are translated to corresponding problems about domains
(for which there is an excellent theory).

Our unified semantic framework for stream computation is as follows. Let $R$
and $T$ be topological algebras of time, $A$ and $B$ topological algebras of data, and
$(R \to B)$ and $(T \to A)$ sets of streams. A stream transformer is a function

$F\colon (R \to B) \to (T \to A)$.

We will allow $T$ and $R$ to be either discrete or continuous models of time, and
allow $A$ and $B$ to be discrete or continuous data types. There are 16 cases in total.
Discrete time is identified with the natural numbers $\mathbb{N}$ or the integers $\mathbb{Z}$ (with
the discrete topology) and continuous time with the real numbers $\mathbb{R}_+$ or $\mathbb{R}$ (with
the Euclidean topology) depending on whether or not there is a starting time.
In particular, the framework can be applied to a complete range of computing
applications, including analogue-digital transformations of the form

$F\colon (\mathbb{R} \to B) \to (\mathbb{N} \to A)$ and $F\colon (\mathbb{N} \to B) \to (\mathbb{R} \to A)$.

First, to each of the four algebras $R$, $T$, $A$ and $B$ we associate a domain
representation $D_R$, $D_T$, $D_A$ and $D_B$ respectively. The domains of continuous
functions $[D_R \to D_B]$ and $[D_T \to D_A]$ naturally represent stream spaces
$(R \to B)$ and $(T \to A)$. Most often we will consider spaces of continuous streams,
denoted $C(R \to B)$ and $C(T \to A)$. We will also consider some spaces of
non-continuous streams, in particular non-zeno signals (see Section 2.7).

Next, for a stream transformer $F$ we show how to construct a continuous
function on the domains

$\bar{F}\colon [D_R \to D_B] \to [D_T \to D_A]$

representing $F$. For example, we will show (Corollary 6.9):

A stream transformer $F\colon C(\mathbb{R} \to \mathbb{R}) \to C(\mathbb{R} \to \mathbb{R})$ is continuous with
respect to the compact-open topology if and only if $F$ has a continuous
representation or lifting $\bar{F}\colon [\mathcal{R} \to \mathcal{R}] \to [\mathcal{R} \to \mathcal{R}]$, where $\mathcal{R}$ is the
standard interval domain representation of $\mathbb{R}$.

The study of properties of $F$, such as computability, is thus reduced to the study
of properties of $\bar{F}$.
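
The idea of a lifting can be made concrete at the level of data before it is applied to streams. The sketch below assumes the usual reading of the interval domain, in which a closed interval approximates every real it contains and a smaller interval carries more information. The class `Interval` and the function `lift_affine` are hypothetical names, and the example lifts only the single map x -> 2x + 1, not a stream transformer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo, hi], read as an approximation of any real in it."""
    lo: float
    hi: float

    def refines(self, other: "Interval") -> bool:
        """True if self is contained in other, i.e. carries at least as
        much information (reverse-inclusion order)."""
        return other.lo <= self.lo and self.hi <= other.hi


def lift_affine(interval: Interval) -> Interval:
    """Lifting of the increasing map x -> 2x + 1: the image of [lo, hi]."""
    return Interval(2 * interval.lo + 1, 2 * interval.hi + 1)


coarse = Interval(0.0, 1.0)
fine = Interval(0.25, 0.5)
```

The lifted map is monotone: refining the input interval refines the output interval, and on a one-point interval it agrees with the original function.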

Despite the smooth theory of domains, this process is not straightforward
because of the relation:

computability implies continuity.

One problem we must deal with is that our unified framework must cope with
computing with non-continuous streams.

For example, consider the sets of continuous and discrete data streams in
continuous time:

$(\mathbb{R} \to [0,1])$ and $(\mathbb{R} \to \{0,1\})$.

In the first case, many applications require models based on the subset of
continuous streams (e.g., wave functions). However, in the second case every non-trivial
application involving discrete signals will need discontinuous functions (since the
only continuous functions $\mathbb{R} \to \{0,1\}$ are constant functions). Thus, to explicate
computability, we must find some reasonable notion of representation of
non-continuous streams. In particular we consider non-zeno signals $NZ(\mathbb{R}_+ \to A)$
consisting of step functions, where $A$ is a discrete space. We show that, for
certain classes of stream transformers, $F\colon NZ(\mathbb{R}_+ \to B) \to NZ(\mathbb{R}_+ \to A)$ has a
continuous approximate representation $\bar{F}\colon [\mathcal{R}_+ \to B_\perp] \to [\mathcal{R}_+ \to A_\perp]$.
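
The parenthetical claim that every continuous function from the reals to a two-point discrete space is constant follows from the connectedness of the real line. A short version of the standard argument, added here as a reminder and not taken from the paper, reads:

```latex
\begin{align*}
&\text{Let } f\colon \mathbb{R} \to \{0,1\} \text{ be continuous, with } \{0,1\} \text{ discrete.}\\
&\text{Then } U_0 = f^{-1}(\{0\}) \text{ and } U_1 = f^{-1}(\{1\}) \text{ are open, disjoint, and } U_0 \cup U_1 = \mathbb{R}.\\
&\text{Since } \mathbb{R} \text{ is connected, } U_0 = \emptyset \text{ or } U_1 = \emptyset, \text{ so } f \text{ is constant.}
\end{align*}
```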

In this paper we will explain the above mathematical framework. In Section 
2 the basic ideas about streams and stream transformers will be defined and
several examples of transformations of time and data in streams will be presented
to motivate the technical development. In Section 3 we summarise the essen- 
tial ideas about algebraic domains, function spaces and effective domains that 
we will use. In Section 4 we explain the method of domain representation for 
topological algebra, including the problem of approximate representations of dis- 
continuous functions. In Section 5 we consider domain representations of stream 
spaces. Finally, in Section 6 we show how domain representations of stream 
transformations for continuous streams and for non-zeno signals are chosen and 
applied. 

This paper is part of a series of articles on the theory of domain
representations of topological algebras [43-47]. We first considered discrete time stream
computation in [46]. We thank our colleagues in the NADA Working Group for
the several stimulating debates on the problems of stream computation starting
with the debate led by Jan Bergstra at the first NADA meeting at Munich 1994
and the NADA stream workshop 1995 hosted by Helmut Schwichtenberg and
Hans Leiss at Elmau. We have also benefitted from discussions on the subject
with Jeff Zucker (McMaster), and Neal Harman and Matthew Poole (Swansea). 

2 Basic properties of streams and stream transformers 

2.1 General definitions 

There are several notions of streams and stream transformers in the literature. 
In this paper we will restrict ourselves to the simplest and, in our view, the most 
natural notion. 

Definition 2.1. Let $T$ be a set of data modelling time and let $A$ be any
non-empty set of data. A stream is a total function $\varphi\colon T \to A$. The set of all streams
from $T$ to $A$ is called the complete stream space from $T$ to $A$. A stream space
from $T$ to $A$ is simply a subset of the complete stream space from $T$ to $A$ and
is usually denoted $(T \to A)$.

This definition is quite open as to what requirements one should put on time 
T. There are many philosophical, physical and mathematical models of time [58]. 
Time can be discrete or continuous. To model discrete time we choose the ordered 
structure of natural numbers $\mathbb{N}$, in the case time has an initial starting point,
or $\mathbb{Z}$, in case there is no starting point. For simplicity in the exposition we will
usually model discrete time by $\mathbb{N}$. Thus a stream with discrete time is for us
simply an infinite sequence of data.

To model continuous time we choose the ordered structure of non-negative
real numbers $\mathbb{R}_+$, in the case time has an initial starting point, or $\mathbb{R}$, in case
there is no starting point. For simplicity we will usually model continuous time
by $\mathbb{R}$, except when an initial starting point is essential. A stream with continuous
time is often called a signal.

There are well-established notions of computability on $\mathbb{N}$ and $\mathbb{R}$ and hence
on our chosen models of discrete and continuous time. Classical models of
computability on the natural numbers are well-known [9, 15, 40]. Computability
models on $\mathbb{R}$ have been developed since the 1950s [7, 17, 29, 37].

Our data set A is in general simply a set or an algebra. In order to discuss 
computability of streams and stream transformers we need to have notions of 
computability on our data algebras as well. 

For discrete algebras we have the usual notion of computability induced by
the computability on $\mathbb{N}$ in the sense of computable algebra [12,30,38,47,49].

For uncountable algebras we need a notion of computability in terms of con- 
crete approximations. Topology can be seen as an abstract theory of approxima- 
tion where an open set approximates all its elements. Usually, but not always, 
our data type is a metric algebra. Of course, every discrete space can be given a 
discrete metric inducing the discrete topology. 

Assumption 2.2. All our data types are topological algebras. 

There are several approaches to computability on topological algebras in- 
cluding Type-2 computability by Weihrauch [57], recursive metric spaces by 
Moschovakis [34], and the approach chosen in this paper, domain representation. 
See [48] for a discussion of the relationship between these and other approaches. 

Having topologies on both time and data allows us to talk about continuity
of streams. Note that each discrete time stream $\varphi\colon \mathbb{N} \to A$ is continuous for any
space $A$. Complete stream spaces with continuous time $T$ and non-trivial data
set $A$ will contain non-continuous streams. When $A$ is a continuous data set, such
as $\mathbb{R}$, it is natural to consider the stream space $C(T \to A)$ of continuous streams.
The stream space $C(T \to A)$ has a natural topology, viz. the compact-open
topology. Note that there is no obvious or canonical topology for the complete
stream space $(T \to A)$ for continuous time $T$.

Consider streams from continuous time into a discrete data set, i.e., signals. 
Then the only continuous streams are the constant streams. Non-continuous 
streams from continuous time to discrete data exist in abundance in models 
used in computing. Thus our theory of streams and stream transformations must 
accommodate them. In order to model non-continuous streams we use approxi- 
mate representations in the function space domains. In this way stream spaces 
containing non-continuous streams (more precisely quotients of such) obtain an 
induced topology from the representing domains. Thus we may, via the use of 
domains, speak about continuity of stream transformers also in this case. 
We now turn to the notion of a stream transformer.

Definition 2.3. Let $(R_i \to B_i)$ and $(T_j \to A_j)$ be stream spaces for $1 \le i \le m$
and $1 \le j \le n$. A stream transformer is a functional
$F\colon \prod_{1 \le i \le m} (R_i \to B_i) \to \prod_{1 \le j \le n} (T_j \to A_j)$.

Thus a stream transformer is a function which takes finitely many streams
as input and gives finitely many streams as output. Usually, for the simplicity
of the exposition, we will restrict ourselves to the case $m = n = 1$. This is not
as restrictive as it first may appear. It is often the case that all input times $R_i$
can be identified with some common input time $R$ and similarly for $T_j$ and $T$.
In this case we have

$\prod_{1 \le i \le m} (R_i \to B_i) = (R \to B_1 \times \cdots \times B_m)$, and

$\prod_{1 \le j \le n} (T_j \to A_j) = (T \to A_1 \times \cdots \times A_n)$

and we may consider $F$ to be a stream transformer having one stream as input
and giving one stream as output.
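
The identification of a tuple of streams over a common time with a single stream of tuples can be sketched as follows. The helper names `pair` and `unpair`, and the two example streams, are illustrative assumptions; only the two-stream case is shown.

```python
from typing import Callable, Tuple


def pair(s1: Callable[[int], int],
         s2: Callable[[int], str]) -> Callable[[int], Tuple[int, str]]:
    """Combine two streams over a common time into one stream of pairs."""
    return lambda t: (s1(t), s2(t))


def unpair(s: Callable[[int], Tuple[int, str]]):
    """Recover the two component streams from a stream of pairs."""
    return (lambda t: s(t)[0]), (lambda t: s(t)[1])


def counter(t: int) -> int:
    """Hypothetical stream 0, 1, 2, ..."""
    return t


def parity(t: int) -> str:
    """Hypothetical stream 'even', 'odd', 'even', ..."""
    return "even" if t % 2 == 0 else "odd"


both = pair(counter, parity)
first, second = unpair(both)
```

Because `pair` and `unpair` are mutually inverse, a transformer with several input streams loses nothing by being treated as a transformer with one input stream of tuples.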

The value $F(\varphi)(t)$ of a stream transformer $F\colon (R \to B) \to (T \to A)$ at time
$t$ may in general depend on the entire input stream $\varphi$. This is not reasonable for
stream transformers modelling physical devices. Also from a computability point
of view it is unreasonable to require infinite information in order to compute a
finite object.

Assume that time $R$ has an initial point 0. Then a stream transformer
$F\colon (R \to B) \to (T \to A)$ is said to satisfy “causality”, or is “finitely
determined”, if for each $t \in T$ there is an $r \in R$ such that the value of $F(\varphi)$ at $t$ is
determined by $\varphi$ restricted to the interval between 0 and $r$.

It is clear that causality of a stream transformer $F$ is intimately connected
with the “continuity” of $F$. The latter will be defined via the domain
representation of stream transformers.
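
Causality can be checked on examples by comparing two inputs that agree on an initial interval. The sketch below uses discrete time and a running-sum transformer chosen purely for illustration; `running_sum`, `agree_up_to`, `phi` and `psi` are hypothetical names.

```python
from typing import Callable

Stream = Callable[[int], int]


def running_sum(phi: Stream) -> Stream:
    """Causal transformer: the output at time t uses only phi(0), ..., phi(t)."""
    return lambda t: sum(phi(s) for s in range(t + 1))


def agree_up_to(phi: Stream, psi: Stream, r: int) -> bool:
    """True if phi and psi coincide on the interval {0, ..., r}."""
    return all(phi(s) == psi(s) for s in range(r + 1))


def phi(t: int) -> int:
    return t


def psi(t: int) -> int:
    """Agrees with phi up to time 5 and differs afterwards."""
    return t if t <= 5 else -1
```

Since `phi` and `psi` agree on 0 to 5, the outputs of `running_sum` at time 5 coincide; they differ at time 6, where the inputs first differ. For this transformer one may take r = t in the definition above.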

2.2 Some examples of streams and stream transformers 

First we consider some examples of streams. Then we give a few general forms of 
stream transformers in which the main idea is to separate time conversions from 
data conversions. We start with a simple model and then give some extensions. 
In each case the models will first be motivated by an example.



Discrete time streams. Streams of the form $f\colon \mathbb{N} \to A$, where $A$ is a
non-empty set, occur throughout computing. Hardware systems are modelled using
streams and stream transformers: at low levels of digital design, the data sets of
bits and $k$-bit words,

$\mathrm{Bit} = \{0, 1\}$ and $\mathrm{Word}_k = \mathrm{Bit}^k$,

are used to represent integers, addresses, flags, reals, pixels, etc. Devices operate
in time using one or more discrete clocks, each represented by $\mathbb{N}$. Their
input-output behaviours are modelled by stream transformations of the form

$F\colon [\mathbb{N} \to \mathrm{Bit}]^n \to [\mathbb{N} \to \mathrm{Bit}]^m$.

For example, a thorough study of modelling a digital correlator is given in [19].
Other examples are in [18,22]. Discrete time streams and stream transformers
are widely used in modelling software. One comprehensive approach, FOCUS,
has been developed by Broy and his co-workers [6]. There, finite partial
streams are also used.
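
Two very small hardware components can serve as a sketch of such bit-stream transformers: a combinational XOR gate (two input streams, one output stream) and a unit-delay register. The component names and the initial register value are assumptions made for the example, not taken from the cited case studies.

```python
from typing import Callable

BitStream = Callable[[int], int]


def xor_gate(a: BitStream, b: BitStream) -> BitStream:
    """Combinational component: output at clock cycle t is a(t) XOR b(t)."""
    return lambda t: a(t) ^ b(t)


def unit_delay(a: BitStream, initial: int = 0) -> BitStream:
    """Register: outputs `initial` at cycle 0 and a(t - 1) afterwards."""
    return lambda t: initial if t == 0 else a(t - 1)


def clock_bit(t: int) -> int:
    """Stream 0, 1, 0, 1, ..."""
    return t % 2


def ones(t: int) -> int:
    """Constant stream 1, 1, 1, ..."""
    return 1


inverted = xor_gate(clock_bit, ones)   # 1, 0, 1, 0, ...
delayed = unit_delay(inverted)         # 0, 1, 0, 1, ...
```

Composing the two components gives a transformer from pairs of bit streams to bit streams, with the single discrete clock modelled by the natural numbers.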



Signals. Streams of the form $f\colon \mathbb{R} \to \{0, 1\}$ are used in digital signal processing
and are drawn as square waves.

These streams do not exist in the physics of devices but are abstractions or
specifications of certain streams of the form $\mathbb{R} \to \mathbb{R}$. Transitions in voltage (or
current) are “modelled” by discontinuities.




Fig. 1. A digital signal $\varphi$ is modelled by a square wave form $f$.



A signal transition from low to high in $f$ takes place at time $t$ when the
continuous wave $\varphi$ passes a threshold value $r$. Furthermore, the threshold value
$r$ is a physical quantity that is not known exactly but to within a range

$[r - \epsilon, r + \epsilon]$;

and the time $t$ is not known exactly but within a range

$[t - \mu, t + \mu]$.

Any waveform $\varphi$ that enters the threshold range during the time range has
the same square wave abstraction, see Figure 1. A device technology determines
some $\epsilon$ and $\mu$ such that it is technically impossible to distinguish two (reasonable,
steep) wave forms that pass through the box.
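
The abstraction from a continuous waveform to a square wave can be sketched by thresholding. This is a simplification: it uses one exact threshold and ignores the uncertainty ranges in threshold and time discussed above. The names `square_wave` and `ramp`, and the voltage figures, are assumptions made for the example.

```python
from typing import Callable


def square_wave(wave: Callable[[float], float],
                threshold: float) -> Callable[[float], int]:
    """Abstract a continuous waveform to a {0, 1}-valued signal:
    1 where the waveform is at or above the threshold, 0 elsewhere."""
    return lambda t: 1 if wave(t) >= threshold else 0


def ramp(t: float) -> float:
    """Hypothetical rising edge: linear from 0 V to 5 V over one time unit."""
    return min(max(5.0 * t, 0.0), 5.0)


digital = square_wave(ramp, threshold=2.5)
```

The resulting signal is discontinuous at the crossing time even though the waveform itself is continuous, which is exactly the kind of stream the theory has to represent.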



Single access stream transformers. Let us look at a very simple type of 
stream transformation. Suppose that a stream transformer simply transforms 
its input data at some point in time and outputs the result at some later time. 

Example 2.4. Consider $G\colon (\mathbb{N} \to \mathbb{R}) \to (\mathbb{N} \to \mathbb{R})$, given by

$G(\varphi)(t) = 2\varphi(t - 1)$ for $t > 0$, with an arbitrary value at $t = 0$.

Notice that we are confronted by a boundary problem at $t = 0$, which is handled
by giving an arbitrary value. So let us now suppose that $t > 0$. The first thing
to do when given a stream and a time at which to compute the new stream is
to calculate the time of interest in the input stream by a function $\tau(t) = t - 1$.
Secondly, use the input stream $\varphi$ to get the input value at the appropriate time.
Thirdly, calculate the output data from the input data by means of the function
$\pi(n) = 2n$.

The following definition gives us a way of constructing stream transformers
of this form, which we call single access stream transformers.

Definition 2.5. (i) Let the functional

$\Phi\colon (R \to B) \times (T \to R) \times (B \to A) \to (T \to A)$,

be defined by

$\Phi(\varphi, \tau, \pi)(t) = \pi(\varphi(\tau(t)))$.

(ii) A stream transformer $F\colon (R \to B) \to (T \to A)$ is of single access type if
there exist a time transformation $\tau$ and a data transformation $\pi$ such that
$F$ is given by

$F(\varphi) = \Phi(\varphi, \tau, \pi)$.

The $G$ of Example 2.4 can now be expressed as

$G(\varphi) = \Phi(\varphi, \tau, \pi)$,

where $\tau$ and $\pi$ are the transformations extracted in the example.

By varying the functions $\tau$ and $\pi$ we get a whole family of stream
transformers. Of course, not all stream transformers are of this form.

The important feature of the single access model is that it allows us to discuss 
time and data transformations independently. 
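
The single access functional of Definition 2.5 translates almost literally into a higher-order function. The sketch below assumes that the time transformation of Example 2.4 is a shift by one step and that its data transformation is doubling; the value chosen at time 0 (`boundary`) is an arbitrary assumption, as are the Python names.

```python
def single_access(phi, tau, pi):
    """Single access functional: the output at time t is pi(phi(tau(t)))."""
    return lambda t: pi(phi(tau(t)))


def G(phi, boundary=0.0):
    """Transformer in the style of Example 2.4.

    `boundary` is the assumed arbitrary output at t = 0.
    """
    tau = lambda t: t - 1        # time transformation
    pi = lambda x: 2 * x         # data transformation
    inner = single_access(phi, tau, pi)
    return lambda t: boundary if t == 0 else inner(t)


def phi(t):
    """Hypothetical input stream 0.0, 1.0, 4.0, 9.0, ..."""
    return float(t * t)


out = G(phi)
```

Swapping in a different `tau` or `pi` changes the timing or the data processing without touching the other, which is the separation the single access model is meant to provide.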
Multiple access stream transformers. We model stream transformers that
depend on the input data at several distinct times.

Example 2.6. Consider the Fibonacci stream transformer
$G\colon (\mathbb{N} \to \mathbb{N}) \to (\mathbb{N} \to \mathbb{N})$, defined, for $t \ge 2$, by

$G(\varphi)(t) = \varphi(t - 1) + \varphi(t - 2)$.

Clearly, $G$ does not fit into the single access model described above since it
accesses the input stream at two different times. Thus, we have two time
transformations, $\tau_1$ and $\tau_2$, and a data transformation $\pi$ taking two input data. The
time and data transformations are

$\tau_1(t) = t - 1$,

$\tau_2(t) = t - 2$, and

$\pi(x, y) = x + y$.

Here is a generalisation of Definition 2.5 to a finite number of accesses to the 
input stream. 

Definition 2.7. (i) Let the functional 

$$\Phi_n: (R \to B) \times (T \to R)^n \times (B^n \to A) \to (T \to A)$$

be defined by 

$$\Phi_n(\varphi, \tau_1, \ldots, \tau_n, \pi)(t) = \pi(\varphi(\tau_1(t)), \ldots, \varphi(\tau_n(t))).$$

(ii) A stream transformer $F: (R \to B) \to (T \to A)$ is of multiple access type 
if there exist time transformations $\tau_1, \ldots, \tau_n$ and a data 
transformation $\pi$ such that $F$ is given by 

$$F(\varphi) = \Phi_n(\varphi, \tau_1, \ldots, \tau_n, \pi).$$

Note that $\Phi = \Phi_1$ for $\Phi$ as in Definition 2.5. 

The $G$ of Example 2.6 can now be expressed as 

$$G(\varphi) = \Phi_2(\varphi, \tau_1, \tau_2, \pi),$$

where $\tau_1$, $\tau_2$ and $\pi$ are the transformations extracted in the example. 
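A multiple access transformer differs from the single access case only in that several time transformations are applied before the data transformation. The sketch below assumes streams are Python callables and takes the data transformation of the Fibonacci-style example to be addition; the names `multiple_access`, `naturals` and `out` are hypothetical.

```python
from typing import Callable, Sequence


def multiple_access(phi: Callable, taus: Sequence[Callable], pi: Callable) -> Callable:
    """Return the stream t -> pi(phi(tau_1(t)), ..., phi(tau_n(t)))."""
    return lambda t: pi(*(phi(tau(t)) for tau in taus))


# Two accesses: one and two time units back, combined by addition (assumed).
tau1 = lambda t: t - 1
tau2 = lambda t: t - 2
add = lambda x, y: x + y


def G(phi: Callable) -> Callable:
    """Two-access transformer; only meaningful for t >= 2."""
    return multiple_access(phi, [tau1, tau2], add)


# Hypothetical input stream: the identity on the naturals.
naturals = lambda n: n
out = G(naturals)  # out(t) = (t - 1) + (t - 2) = 2t - 3 for t >= 2
```

With a single time transformation in `taus`, this reduces to the single access model.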



Accessing an interval of the input stream. 

Example 2.8. Let $G: (\mathbb{R} \to \mathbb{R}) \to (\mathbb{R} \to \mathbb{R})$ be defined by 

$$G(f)(t) = \max_{x} f(x),$$

where $x$ ranges over an interval of input times determined by $t$. 

The stream transformer $G$ depends on a continuum of values of the input stream, 
so there is no possibility of modifying the single access model in the way done 
above for a finite number of accesses to the input stream. 

If we generalise from the example above we could allow both endpoints of the 
interval to depend on $t$, so that the data transformation is of the following 
kind 

$$\pi: (R \to B) \times R \times R \to A,$$

where $\pi$ is defined by 

$$\pi(f, a, b) = \max_{x \in [a, b]} f(x).$$

Instead of max we could have any of a number of common operations, e.g., 
the integral 

$$\pi(f, a, b) = \int_a^b f(x)\,dx.$$



Definition 2.9. (i) Let the functional 

$$\Psi: (R \to B) \times (T \to R)^2 \times ((R \to B) \times R^2 \to A) \to (T \to A)$$

be defined by 

$$\Psi(f, \tau_1, \tau_2, \pi)(t) = \pi(f, \tau_1(t), \tau_2(t)).$$

(ii) A stream transformer $F: (R \to B) \to (T \to A)$ is of interval access type 
if there exist time transformations $\tau_1$, $\tau_2$, and a data transformation 
$\pi$ with the property that $f = g$ on $[a, b]$ implies $\pi(f, a, b) = \pi(g, a, b)$, 
such that $F$ is given by 

$$F(f) = \Psi(f, \tau_1, \tau_2, \pi).$$

The stream transformer $G$ of Example 2.8 can now be expressed as 

$$G(f) = \Psi(f, \tau_1, \tau_2, \pi),$$

where $\tau_1$, $\tau_2$ and $\pi$ are the transformations extracted in the example. 
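For discrete input time the interval access functional can be computed exactly, since an interval contains only finitely many time points. The following sketch makes that simplifying assumption (input time is the naturals, not a continuum); the window of three time units and the names `interval_access`, `max_on` and `sliding_max` are hypothetical.

```python
from typing import Callable


def interval_access(f: Callable, tau1: Callable, tau2: Callable, pi: Callable) -> Callable:
    """Return the stream t -> pi(f, tau1(t), tau2(t))."""
    return lambda t: pi(f, tau1(t), tau2(t))


def max_on(f: Callable[[int], int], a: int, b: int) -> int:
    """Maximum of f over the integer points of [a, b].

    Only values of f inside [a, b] are inspected, so two streams that agree
    on [a, b] give the same result, as the definition requires.
    """
    return max(f(x) for x in range(a, b + 1))


# Hypothetical input stream with values 0, 2, 4, 1, 3, 0, 2, ...
f = lambda x: (7 * x) % 5

# Both endpoints depend on t: the window is [max(0, t - 2), t].
sliding_max = interval_access(f, lambda t: max(0, t - 2), lambda t: t, max_on)
```

For continuous time the maximum would have to be approximated or obtained analytically; that case is not covered by this sketch.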



2.3 Time transformations 

In the models exhibited so far we have extracted time and data transformations 
as parts of stream transformers. The data transformations considered are arbi- 
trary functions on data. However, time transformations are of special interest 
and we will now look at some of the most fundamental time transformations. 
Constant delay is modelled by a time transformation of the form 

$$\tau(t) = t - d,$$

where $d$ is the constant delay. Hence a simple delay node is 

$$\Phi(\varphi, \tau, \mathrm{id}).$$

We can adjust for two clocks running at different speeds by a time 
transformation of the form 

$$\tau(t) = kt,$$

where $k$ is the number of time units that pass on the input clock during 
one time unit on the output clock. Hence, using the single access functional 
of Definition 2.5, a stream transformer which outputs every other datum of a 
discrete stream is given by 

$$\Phi(\varphi, \tau, \mathrm{id}),$$

where $\tau(t) = 2t$. 

If the input and output clocks run at the same speed but are out of phase, 
then this can be modelled by 

$$\tau(t) = t - m,$$

where $m$ is the time offset. The offset is positive if the output time is ahead of 
the input time. There is a difference between a delay and the offset considered 
here, since a delay cannot be negative whereas the time offset may be negative. 

We have exhibited three different time transformations which are linear and 
can be used regardless of time being discrete or continuous. However, sometimes 
it is desirable to consider non-linear time transformations. 

Here is a useful general class of time transformations. 

Definition 2.10. A mapping $\tau: T \to R$ is a retiming if $\tau(0) = 0$ and $\tau$ is 
surjective and monotonic with respect to the orderings on $T$ and $R$. 

In the case that T and R are discrete clocks this notion of retiming is easy 
to understand and useful in both theoretical investigations and design exercises. 
It was introduced in a study of the design of digital correlators and further 
developed through applications to UARTs and microprocessors; see 
[18,19,22,23,25]. 

Fig. 2. A retiming $\tau$. (Figure labels: $r + 1$, $\mathit{start}_\tau(r)$, $\mathit{start}_\tau(r + 1)$.) 

A retiming $\tau: T \to R$ relates discrete clocks as in Figure 2. The figure 
illustrates the set 

$$\tau^{-1}(r) = \{t \in T : \mathit{start}_\tau(r) \le t < \mathit{start}_\tau(r + 1)\},$$

where $\mathit{start}_\tau(r) = (\text{least } t)[\tau(t) = r]$. 
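For discrete clocks the start of a cycle and the set of times belonging to it can be found by a simple search. The sketch below assumes both clocks are the naturals and that the retiming really is surjective and monotonic (otherwise the search need not terminate); the example retiming `t // 3` and the names `start` and `cycle` are hypothetical.

```python
from typing import Callable, List


def start(tau: Callable[[int], int], r: int) -> int:
    """Least t with tau(t) == r; exists because tau is assumed surjective."""
    t = 0
    while tau(t) != r:
        t += 1
    return t


def cycle(tau: Callable[[int], int], r: int) -> List[int]:
    """All t with start(tau, r) <= t < start(tau, r + 1)."""
    return list(range(start(tau, r), start(tau, r + 1)))


# Hypothetical retiming: three ticks of the clock T for each tick of R.
tau = lambda t: t // 3
```

Here `cycle(tau, r)` lists exactly the times of `T` that the retiming maps to `r`, because a monotonic map takes the value `r` on one contiguous block of times.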



2.4 State transformations 



Consider a server responding to requests. We assume that the requests appear 
at discrete times. The requests and the responses are easily seen to be streams, 
leaving us with the conclusion that the server should be modelled as a stream 
transformer. 

The server typically has an internal state which governs the responses to 
the requests. We consider a server that has an internal state and is working in 
discrete time, i.e., $T = \mathbb{N}$. 

Let $I$ be a set of requests, $O$ a set of responses and $S$ a set of internal states. 
Suppose the server's transitions are governed by the functions 

$$\mathit{out}: S \times I \to O, \text{ and}$$
$$\mathit{next}: S \times I \to S,$$



which determine the output and next state from the input and current state. 

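In responding to a stream of requests in discrete time, starting from some 
initial state, the dynamical behaviour of the server is governed by 

$$\Phi_{\mathit{state}}: S \times (T \to I) \to (T \to S),$$

$$\Phi_{\mathit{state}}(s, i)(t) = \begin{cases} s, & \text{if } t = 0; \\ \mathit{next}(\Phi_{\mathit{state}}(s, i)(t - 1),\ i(t - 1)), & \text{if } t > 0, \end{cases}$$

and 

$$\Phi_{\mathit{out}}: S \times (T \to I) \to (T \to O),$$

$$\Phi_{\mathit{out}}(s, i)(t) = \begin{cases} \text{undefined}, & \text{if } t = 0; \\ \mathit{out}(\Phi_{\mathit{state}}(s, i)(t - 1),\ i(t - 1)), & \text{if } t > 0. \end{cases}$$

The typing of the functions $\Phi_{\mathit{state}}$ and $\Phi_{\mathit{out}}$ does not fall within our definition 
of stream transformers since we have allowed them to take arguments other than 
streams as parameters. 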
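The state and output streams of such a server can be sketched directly from the transition functions. The code below is an illustration only: it unfolds the recursion iteratively, represents the undefined output at time 0 by `None`, and uses a hypothetical running-total server; the names `phi_state`, `phi_out`, `next_fn` and `out_fn` are assumptions of the sketch.

```python
from typing import Callable, Optional


def phi_state(next_fn: Callable, s, i: Callable[[int], object]) -> Callable[[int], object]:
    """State stream: state(0) = s, state(t) = next_fn(state(t-1), i(t-1))."""
    def state(t: int):
        cur = s
        for k in range(t):  # apply the transition once per elapsed time unit
            cur = next_fn(cur, i(k))
        return cur
    return state


def phi_out(out_fn: Callable, next_fn: Callable, s, i: Callable[[int], object]) -> Callable[[int], Optional[object]]:
    """Output stream: undefined (None) at 0, else out_fn(state(t-1), i(t-1))."""
    state = phi_state(next_fn, s, i)
    def output(t: int):
        if t == 0:
            return None
        return out_fn(state(t - 1), i(t - 1))
    return output


# Hypothetical server: the state is a running total; the response is the new total.
next_fn = lambda s, x: s + x
out_fn = lambda s, x: s + x
requests = lambda k: k + 1  # request stream 1, 2, 3, ...

state = phi_state(next_fn, 0, requests)
output = phi_out(out_fn, next_fn, 0, requests)
```

Fixing the initial state, for example `lambda i: phi_out(out_fn, next_fn, 0, i)`, gives a function from request streams to response streams, which has the type of a stream transformer.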



2.5 The analogue to digital converter 

The analogue to digital converter (AD-converter) is a device which samples an 
analogue electrical signal at discrete times and gives discrete approximations of 
the analogue signal at those times. 

We start by giving a very simple model of an idealised AD-converter. 

Fig. 3. Stream transformer with an internal state. 



Example 2.11. Combining the use of the floor function both as a time 
transformation and as a data transformation in the single access model will give 
us a very simple model of an AD-converter. Let 
$G: (\mathbb{R}_+ \to \mathbb{R}) \to (\mathbb{N} \to \mathbb{Z})$ be defined by 

$$G(\varphi) = \Phi(\varphi, \tau, \pi),$$

where $\tau(x) = \pi(x) = \lfloor x \rfloor$. Then $G$ is a simple AD-converter. 

We will now try to model a hardware AD-converter more faithfully. A hardware 
AD-converter will first bound the input signal to some closed interval. It 
will then sample the input stream at discrete points in time. More advanced 
converters will sample the input stream at several time points for every value 
they output; this is called oversampling. The output value is then calculated 
from the sampled values by some filtering algorithm. 

Note that we have not made any effort to model the anomalous behaviour 
that hardware can exhibit if the input is widely out of range. 

Example 2.12. We will model a 16-bit AD-converter with an output rate of 
10 kHz, an oversampling factor of 100, and which converts the interval 0 V to 
5 V. 

The bounding step now amounts to bounding the input signal (stream) to 
the interval 0 V to 5 V. Let 
$B: (\mathbb{R}_+ \to \mathbb{R}) \to (\mathbb{R}_+ \to \mathbb{R})$ be defined by 

$$B(\varphi) = \Phi(\varphi, \mathrm{id}, \pi),$$

where $\pi$ is given by 

$$\pi(x) = \max(0, \min(x, 5)).$$

Clearly, B is a stream transformer bounding the input stream in the desired way. 

The sampling step can now be modelled similarly to our trivial AD-converter 
example above. We model the sampling step by 
$S: (\mathbb{R}_+ \to \mathbb{R}) \to (\mathbb{N} \to \mathbb{Z})$ given by 

$$S(\varphi) = \Phi(\varphi, \tau, \pi),$$

where $\tau(t) = \lfloor 10^6 \cdot t \rfloor$ and $\pi(x) = \lfloor 2^{16} \cdot x / 5 \rfloor$. 
The constant $10^6$ corresponds to the output frequency times the oversampling 
factor. The constant $2^{16}$ corresponds to 16 bit conversion. The constant 5 is 
the adjustment for the voltage interval. 

The filtering step will become a multiple access stream transformer. We model 
the filtering by $F: (\mathbb{N} \to \mathbb{Z}) \to (\mathbb{N} \to \mathbb{Z})$ given by 

$$F(\varphi) = \Phi_{100}(\varphi, \tau_0, \ldots, \tau_{99}, \pi),$$

where $\tau_i(t) = 100t + i$ and $\pi(x_0, \ldots, x_{99})$ is the filtering function. 
The filtering function will typically remove values that are far from the 
average value and then compute the average of the remaining values. 

Our hardware AD-converter can now simply be modelled by composing the 
bounding, sampling and filtering stream transformers given above. Thus we define 
our AD-converter $AD: (\mathbb{R}_+ \to \mathbb{R}) \to (\mathbb{N} \to \mathbb{Z})$ by 

$$AD = F \circ S \circ B.$$
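The composition of bounding, sampling and filtering can be sketched as follows. Several details are assumptions of this sketch and not taken from the text: input time is measured in seconds and the k-th raw sample is taken at k / 10^6 seconds, the quantisation is taken to be floor(2^16 * x / 5), and the filtering function `trimmed_mean` is one hypothetical choice of outlier rejection followed by averaging.

```python
import math
from typing import Callable

SAMPLE_RATE = 10_000 * 100  # output rate times oversampling factor


def single_access(phi, tau, pi):
    return lambda t: pi(phi(tau(t)))


def multiple_access(phi, taus, pi):
    return lambda t: pi(*(phi(tau(t)) for tau in taus))


def bound(phi: Callable[[float], float]) -> Callable[[float], float]:
    """Clamp the input signal to the interval from 0 V to 5 V."""
    return single_access(phi, lambda t: t, lambda x: max(0.0, min(x, 5.0)))


def sample(phi: Callable[[float], float]) -> Callable[[int], int]:
    """Raw sample k, taken at k / SAMPLE_RATE seconds (assumed), quantised."""
    return single_access(phi, lambda k: k / SAMPLE_RATE,
                         lambda x: math.floor(2 ** 16 * x / 5))


def trimmed_mean(*xs: int) -> int:
    """Hypothetical filter: drop values far from the mean, average the rest."""
    mean = sum(xs) / len(xs)
    spread = max(abs(x - mean) for x in xs)
    kept = [x for x in xs if abs(x - mean) <= spread / 2] or list(xs)
    return sum(kept) // len(kept)


def filter_stream(phi: Callable[[int], int]) -> Callable[[int], int]:
    """Output value t is computed from raw samples 100t, ..., 100t + 99."""
    taus = [lambda t, i=i: 100 * t + i for i in range(100)]
    return multiple_access(phi, taus, trimmed_mean)


def ad(phi: Callable[[float], float]) -> Callable[[int], int]:
    """The composition: filtering after sampling after bounding."""
    return filter_stream(sample(bound(phi)))
```

For a constant input of 2.5 V every raw sample is quantised to 32768, so the filtered output is 32768 at every output time; an input below 0 V is clamped and yields 0.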



2.6 DA-conversion or curve fitting 

A digital signal $\varphi$ can be seen as a function from $\mathbb{N}$ to some discrete set (usually 
a finite set). We assume that the values of the digital signal form a finite subset 
of $\mathbb{R}$. We also assume that the discrete time is embedded into continuous time 
by an increasing function $\tau$. Hence every pair $(\tau(n), \varphi(n))$ is a point in the space 
$\mathbb{R}_+ \times \mathbb{R}$. To convert a digital signal to an analogue signal we need to extend 
the enumerable set of points in $\mathbb{R}_+ \times \mathbb{R}$ to a function in $(\mathbb{R}_+ \to \mathbb{R})$. Figure 4 
indicates a few alternatives of how a digital signal $\varphi$ may be converted into a 
continuous signal. The first is to extend the point set to a step function (this is 
the normal description of a hardware DA-converter), the second makes a linear 
approximation and the third uses polynomials of degree three. 

It is easy to describe the DA-converting stream transformers above. This is 
done for the linear case in the example below. 



Example 2.13. The following stream transformer converts a digital signal to a 
linear analogue signal. Let $G: (\mathbb{N} \to \mathbb{Z}) \to (\mathbb{R} \to \mathbb{R})$ be given by 

$$G(\varphi)(t) = \begin{cases} \varphi(0), & \text{if } t < 1; \\ \varphi(\lfloor t - 1 \rfloor) + (t - \lfloor t \rfloor)(\varphi(\lfloor t \rfloor) - \varphi(\lfloor t - 1 \rfloor)), & \text{otherwise.} \end{cases}$$

It is easy to see that there are two time transformations in operation here, 
namely, $\tau_1(t) = \lfloor t - 1 \rfloor$ and $\tau_2(t) = \lfloor t \rfloor$. However, the data transformation 
depends not only on the input data at both of these times, but also on the time $t$. 
Hence this stream transformer does not fall into any of the previously discussed 
general forms of stream transformers. 
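The linear DA-conversion can be written out directly. This is a minimal sketch of the piecewise formula, assuming the digital signal is a Python callable on the naturals; the names `da_linear` and `phi` are hypothetical.

```python
import math
from typing import Callable


def da_linear(phi: Callable[[int], int]) -> Callable[[float], float]:
    """Piecewise linear analogue signal built from a digital signal."""
    def g(t: float) -> float:
        if t < 1:
            return phi(0)
        lo = math.floor(t - 1)  # first time transformation
        hi = math.floor(t)      # second time transformation
        # The interpolation weight t - floor(t) depends on t itself,
        # not only on the two sampled data values.
        return phi(lo) + (t - hi) * (phi(hi) - phi(lo))
    return g


# Hypothetical digital signal n -> n * n.
phi = lambda n: n * n
g = da_linear(phi)
```

The weight `t - hi` is what prevents this transformer from fitting the multiple access form, whose data transformation sees only the sampled values.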



2.7 Signal transformations 

A signal is a stream with continuous time and discrete data. Normally, when 
considering signals, we consider time with a starting point, i.e. time is $\mathbb{R}_+$. Thus 
a signal is a function $\mathbb{R}_+ \to A$ where $A$ is a discrete data set. A signal 
transformer is a stream transformer taking signals to signals. Signal transformers 
are sometimes called signal operators. 

Fig. 4. DA-conversion of a digital signal $\varphi$. 



Signals and operators on signals have been much studied in the literature 
from a computer science point of view, in particular in connection with hybrid 
systems [16]. Automata over continuous time are considered in [51,39]. 

As mentioned earlier, the only continuous functions from $\mathbb{R}_+$ into a discrete 
data set $A$ are the constant functions. Thus the class of continuous signals is no 
more interesting than the data set $A$. On the other hand it should be clear, and 
will be apparent later, that the class of all signals is too wide when considering 
signal operators. For example, the signal taking the value 0 at rational time 
points and the value 1 at irrational time points is too irregular to be distinguished 
by a reasonable operator. 

From the discussion of signals in Section 2.2 we see that in order to model 
digital systems it suffices to consider the following subclass of signals. 

Definition 2.14. Let $A$ be a discrete countable set. A non-zeno signal over $A$ 
is a stream $\varphi: \mathbb{R}_+ \to A$ such that $\varphi$ is continuous at 0 and for each $t > 0$, $\varphi$ has 
only finitely many discontinuities in $[0, t]$. 

Thus a non-zeno signal is a step function with finitely many jumps on $[0, t]$. 

We now consider single and multiple access signal operators. We need to 
slightly strengthen the notion of retiming (Definition 2.10). 

Definition 2.15. A mapping $\tau: \mathbb{R}_+ \to \mathbb{R}_+$ is a strict retiming if 

(i) $\tau$ is surjective and monotonic; 

(ii) $\tau(0) = 0$; and 

(iii) $\tau(t) < \tau(t')$ whenever $\mathrm{Init}(\tau) < t < t'$, where 
$\mathrm{Init}(\tau) = \sup\{t \in \mathbb{R}_+ : \tau(t) = 0\}$. 

Note that a strict retiming is continuous. 

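Theorem 2.16. Let $F: (\mathbb{R}_+ \to B) \to (\mathbb{R}_+ \to A)$ be a single access signal 
operator with respect to $\pi: B \to A$ and a strict retiming $\tau: \mathbb{R}_+ \to \mathbb{R}_+$. Then $F$ 
takes non-zeno signals to non-zeno signals. 

Proof. Let $\varphi: \mathbb{R}_+ \to B$ be non-zeno. Then 

$$F(\varphi)(t) = \pi(\varphi(\tau(t))).$$

Assume that $\varphi$ is discontinuous at $t_0 < t_1 < \cdots$. Then $\varphi \circ \tau$ is discontinuous 
at precisely $\tau^{-1}(t_0) < \tau^{-1}(t_1) < \cdots$, and hence the set of discontinuities of 
$F(\varphi)$ is contained in $\{\tau^{-1}(t_0), \tau^{-1}(t_1), \ldots\}$. Now, $t_0 > 0$ since $\varphi$ is non-zeno, so 
$\tau^{-1}(t_0) > 0$, i.e. $F(\varphi)$ is continuous at 0. Similarly, by the monotonicity of $\tau$, 
$\tau^{-1}(t_0) < \tau^{-1}(t_1) < \cdots$ is unbounded in $\mathbb{R}_+$, unless finite. □ 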

The theorem clearly extends to multiple access signal operators. 

3 Domains 

In this section we will briefly review some basic and relevant parts of domain 
theory. We concentrate on giving the notions and some results that are needed for 
our analysis. All proofs are omitted and can be found in the basic reference [42]. 

3.1 Preliminaries on domains 

Let $D = (D; \sqsubseteq)$ be a partial order and let $A \subseteq D$. We will use the notation ${\uparrow}A$ 
to denote the set $\{y \in D : \exists x \in A\,(x \sqsubseteq y)\}$. The set ${\uparrow}\{x\}$ is abbreviated by ${\uparrow}x$. 
We define ${\downarrow}A$ and ${\downarrow}x$ dually. $A$ is directed if $A \neq \emptyset$ and whenever $x, y \in A$ then 
there is $z \in A$ such that $x \sqsubseteq z$ and $y \sqsubseteq z$. The supremum, or least upper bound, 
of $A$ (if it exists) is denoted by $\bigsqcup A$. As usual we write $x \sqcup y$ instead of $\bigsqcup\{x, y\}$. 

A complete partial order, abbreviated cpo, is a partial order, $D = (D; \sqsubseteq, \bot)$, 
such that $\bot$ is the least element in $D$ and any directed set $A \subseteq D$ has a supremum 
$\bigsqcup A$ in $D$. 

Let $D$ be a cpo. Then an element $a \in D$ is compact if whenever $A \subseteq D$ is 
a directed set and $a \sqsubseteq \bigsqcup A$, then $a \in {\downarrow}A$. The set of compact elements of $D$ is 
denoted by $D_c$. 

A cpo $D$ is algebraic if for each $x \in D$, the set 

$$\mathrm{approx}(x) = {\downarrow}x \cap D_c$$

is directed and $x = \bigsqcup \mathrm{approx}(x)$. A cpo $D$ is consistently complete if $\bigsqcup A$ exists 
in $D$ whenever $A \subseteq D$ is a consistent set, i.e., has an upper bound. 

The domains we consider in this paper are of the following kind. 

Definition 3.1. A Scott-Ershov domain, or simply domain, is a consistently 
complete algebraic cpo. 

The topology normally used on domains is called the Scott topology. Let $D$ 
be an algebraic cpo. Then $U \subseteq D$ is open if 

(i) $x \in U$ and $x \sqsubseteq y$ implies $y \in U$, and 

(ii) $x \in U$ implies that there exists $a \in \mathrm{approx}(x)$ such that $a \in U$. 

An easy observation is that the Scott topology on a non-trivial domain is $T_0$ 
but not $T_1$. Furthermore, the sets ${\uparrow}a$ for $a \in D_c$ constitute a base for the Scott 
topology on $D$. We will also write $B_a$ for ${\uparrow}a$. 

Let $D$ and $E$ be cpo's. A function $f: D \to E$ is continuous with respect to 
the Scott topology if, and only if, $f$ is monotone and 

$$f\Big(\bigsqcup A\Big) = \bigsqcup f[A]$$

for any directed set $A \subseteq D$. 

Any continuous function between domains is determined by its values on the 
compact elements. In fact, let $D$ and $E$ be domains. Then a monotone function 
$f: D_c \to E$ has a unique extension to a continuous function $g: D \to E$ such that 
$f = g|_{D_c}$. 

A conditional upper semi-lattice with least element, abbreviated cusl, is a 
partially ordered set where each finite bounded set has a least upper bound. 
The set of compact elements $D_c$ of a domain $D$ forms a cusl. Every domain is 
obtained as a completion of a cusl in the following way. 

Definition 3.2. Let $P$ be a cusl. Then $I \subseteq P$ is an ideal if 

(i) $\bot \in I$, 

(ii) $a \in I$ and $b \sqsubseteq a$ implies $b \in I$, and 

(iii) $a, b \in I$ implies $a \sqcup b$ exists and $a \sqcup b \in I$. 

For $a \in P$ we let $[a]$ denote the principal ideal generated by $a$. The ideal 
completion of a cusl $P$ is the set of all ideals over $P$, denoted $\mathrm{Idl}(P)$. When 
ordered by set inclusion the ideal completion of a cusl is a domain. The compact 
elements of $\mathrm{Idl}(P)$ are the principal ideals $[a]$, for $a \in P$. 

The representation theorem for Scott-Ershov domains tells us that any domain 
is the ideal completion of a cusl. 

Theorem 3.3. Let $D$ be a Scott-Ershov domain. Then $\mathrm{Idl}(D_c) \cong D$. 

We clearly have the following equivalence, for $I \in \mathrm{Idl}(P)$: 

$$[a] \subseteq I \iff a \in I.$$

Thus the basic open sets of $\mathrm{Idl}(P)$ in the Scott topology are of the form 
$B_a = \{I \in \mathrm{Idl}(P) : a \in I\}$ for $a \in P$. 

The class of domains has pleasing closure properties. In particular, what is 
essential for our purposes, the category of domains is cartesian closed. Most 
importantly, this means that for any domains $D$ and $E$, the function space 

$$[D \to E] = \{f: D \to E \mid f \text{ is continuous}\}$$

is a domain, where the ordering $\sqsubseteq$ on $[D \to E]$ is given by 
$f \sqsubseteq g \iff (\forall x \in D)(f(x) \sqsubseteq g(x))$. 

We recall some basic facts, and notations, of the compact elements in 
$[D \to E]$. For $a \in D_c$ and $b \in E_c$ we consider the “step function” $(a; b): D \to E$ 
defined by 

$$(a; b)(x) = \begin{cases} b, & \text{if } a \sqsubseteq x; \\ \bot, & \text{otherwise.} \end{cases}$$

Then $(a; b)$ is continuous and compact. Furthermore, for any $f \in [D \to E]$ we 
have 

$$(a; b) \sqsubseteq f \iff b \sqsubseteq f(a).$$

The compact elements of $[D \to E]$ are those of the form 

$$\bigsqcup_{i=1}^{n} (a_i; b_i),$$

where $\{(a_i; b_i) : i = 1, \ldots, n\}$ is consistent (i.e. bounded) in $[D \to E]$. The 
latter holds if, and only if, for each $I \subseteq \{1, \ldots, n\}$, 

$$\{a_i : i \in I\} \text{ consistent} \implies \{b_i : i \in I\} \text{ consistent.}$$



Recall that consistent completeness is used to prove the above properties. In 
fact, the class of algebraic cpo’s is not closed under the function space construc- 
tion. 



3.2 Effective domains 

In this section we briefly recall some basic notions of effectivity or computability 
on domains. We start by recalling some general notions. 

A structure $A$ is a tuple $A = (A; R_1, \ldots, R_p, \sigma_1, \ldots, \sigma_q)$, where $A$ is a 
non-empty set, $R_j \subseteq A^{n_j}$ is an $n_j$-ary relation and $\sigma_i: A^{n_i} \to A$ is an $n_i$-ary 
operation on $A$. A numbering of a structure $A$ is a surjective function 
$\alpha: \Omega_\alpha \to A$, where $\Omega_\alpha \subseteq \omega$. Let $\equiv_\alpha$ denote the equivalence relation defined 
on $\Omega_\alpha$ by 

$$m \equiv_\alpha n \iff \alpha(m) = \alpha(n).$$

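Definition 3.4. Let $\alpha: \Omega_\alpha \to A$ be a numbering of the structure 
$A = (A; R_1, \ldots, R_p, \sigma_1, \ldots, \sigma_q)$. Then $\alpha$ is a weakly effective numbering of $A$ and 
the pair $(A, \alpha)$ is a weakly effective structure if (i) and (ii) below hold. 

(i) For each $i = 1, \ldots, q$ there is an $n_i$-ary partial recursive function $\hat{\sigma}_i$ such 
that for each $m_1, \ldots, m_{n_i} \in \Omega_\alpha$, $\hat{\sigma}_i(m_1, \ldots, m_{n_i}){\downarrow}$ (where ${\downarrow}$ means defined) 
and 

$$\sigma_i(\alpha(m_1), \ldots, \alpha(m_{n_i})) = \alpha(\hat{\sigma}_i(m_1, \ldots, m_{n_i})).$$

(ii) For each $j = 1, \ldots, p$ there is an $n_j$-ary recursive relation $\hat{R}_j$ such that for 
each $m_1, \ldots, m_{n_j} \in \Omega_\alpha$, 

$$R_j(\alpha(m_1), \ldots, \alpha(m_{n_j})) \iff \hat{R}_j(m_1, \ldots, m_{n_j}).$$

We say that $\hat{\sigma}_i$ and $\hat{R}_j$ track $\sigma_i$ and $R_j$ respectively. 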

Definition 3.5. A weakly effective numbering $\alpha$ of a structure $A$ is computable 
if $\Omega_\alpha$ is a recursive set and $\equiv_\alpha$ is a recursive relation. If $\alpha$ is computable then 
the pair $(A, \alpha)$ is a computable structure. 

Let $(A, \alpha)$ and $(B, \beta)$ be weakly effective structures. A function $f: A \to B$ 
is $(\alpha, \beta)$-computable if there is a partial recursive function $\hat{f}$ such that 
$\Omega_\alpha \subseteq \mathrm{dom}(\hat{f})$ and for each $m \in \Omega_\alpha$, $f(\alpha(m)) = \beta(\hat{f}(m))$. 

A partial function $g: A \to B$ is $(\alpha, \beta)$-computable if there is a partial recursive 
function $\hat{g}$ such that $\mathrm{dom}(g \circ \alpha) \subseteq \mathrm{dom}(\hat{g})$ and $g(\alpha(m)) = \beta(\hat{g}(m))$, for 
$m \in \mathrm{dom}(g \circ \alpha)$. 

A set $C \subseteq A$ is $\alpha$-decidable ($\alpha$-semidecidable) if there is a recursive (r.e.) set 
$W$ such that $\alpha^{-1}[C] = W \cap \Omega_\alpha$. A recursive (r.e.) index for $W$, in the usual 
sense of recursion theory, is called a recursive (r.e.) $\alpha$-index of $C$. 

When regarding computability on a cusl we are often not only interested in 
having a decidable ordering but also in having a decidable consistency relation 
and the ability to compute suprema of finite consistent sets. Therefore we consider 
a cusl to be a structure of the form 

$$P = (P; \sqsubseteq, \mathrm{Cons}, \sqcup, \bot).$$

Definition 3.6. Let $P$ be a cusl. Then $(P, \alpha)$ is a computable cusl if $\alpha$ is a 
computable numbering of the structure $P = (P; \sqsubseteq, \mathrm{Cons}, \sqcup, \bot)$. A domain $D$ 
is an effective domain if there is $\alpha$ such that $(D_c, \alpha)$ is a computable cusl. We 
denote this effective domain by $(D, \alpha)$. 

It is clear from earlier remarks about the function space that if $D$ and $E$ are 
effective domains then so is $[D \to E]$ with a numbering obtained uniformly from 
the numberings of $D$ and $E$. 

We now extend computability from the cusl of compact elements to the 
whole domain. 

Definition 3.7. Let $(D, \alpha)$ be an effective domain. Then $x \in D$ is $\alpha$-computable 
if $\mathrm{approx}(x)$ is $\alpha$-semidecidable. An r.e. index of the set $\alpha^{-1}[\mathrm{approx}(x)]$ is 
an $\alpha$-index of the computable element $x$. 

The prefix $\alpha$ will be dropped when the numbering is clear from the context. 

Let $D_k = \{x \in D : x \text{ is computable}\}$. Note that $D_c \subseteq D_k$. 

Definition 3.8. Let $(D, \alpha)$ and $(E, \beta)$ be effective domains. A continuous 
function $f: D \to E$ is $(\alpha, \beta)$-effective if the relation $R \subseteq D_c \times E_c$ defined by 

$$R(a, b) \iff b \sqsubseteq f(a)$$

is $(\alpha, \beta)$-semidecidable, that is the relation 

$$\hat{R}(m, n) \iff R(\alpha(m), \beta(n))$$

is r.e. An r.e. index for $\hat{R}$ is an effective index for $f$ with respect to $\alpha$ and $\beta$. 

Lemma 3.9. Let $(D, \alpha)$, $(E, \beta)$ and $(F, \gamma)$ be effective domains and let 
$f: D \to E$ and $g: E \to F$ be continuous and $(\alpha, \beta)$-effective and $(\beta, \gamma)$-effective 
respectively. 

(i) If $x \in D$ is $\alpha$-computable then $f(x) \in E$ is $\beta$-computable. 

(ii) The composition $h = g \circ f$ is $(\alpha, \gamma)$-effective. 

We observe that the standard proof, see [42], is uniform. That is, we can 
uniformly obtain an index for $f(x)$ from indices for $f$ and $x$. Similarly an index 
for $h$ is obtained uniformly from indices of $f$ and $g$. 

We can extend computability via a numbering from $D_c$ to the computable 
elements $D_k$ in the following way. 

Theorem 3.10. Let $(D, \alpha)$ be an effective domain. Then there is a numbering 
$\bar{\alpha}: \omega \to D_k$ such that 

(i) the inclusion mapping $\iota: D_c \to D_k$ is $(\alpha, \bar{\alpha})$-computable and, 

(ii) the relation $R(n, m) \iff \alpha(n) \sqsubseteq \bar{\alpha}(m)$ is r.e., i.e., $\mathrm{approx}(\bar{\alpha}(m))$ is 
$\alpha$-semidecidable uniformly in $m$. 

It can be shown that the numbering $\bar{\alpha}$ is the unique one satisfying (i) and 
(ii), up to recursive equivalence, see [42]. This numbering is the “correct” one 
since it identifies effectiveness and computability for functions. 

Theorem 3.11. Let $(D, \alpha)$ and $(E, \beta)$ be effective domains. Then a continuous 
function $f: D \to E$ is $(\alpha, \beta)$-effective if and only if $f|_{D_k}: D_k \to E_k$ is 
$(\bar{\alpha}, \bar{\beta})$-computable. 

In fact, the theorem in its stronger form says that any $(\bar{\alpha}, \bar{\beta})$-computable 
function $f: D_k \to E_k$ extends to a continuous $(\alpha, \beta)$-effective $f: D \to E$. 

4 Domain representability 

Our overall aim is to model streams and stream transformers using domains 
in a uniform way covering both discrete and continuous time and discrete and 
continuous data. The method we choose is domain representation in which time 
$T$ and data $A$ are represented by domains $D_T$ and $D_A$, and a stream space 
is represented by the domain $[D_T \to D_A]$. First we recall the basic notions of 
domain representability and then we describe standard domain representations of 
metric spaces. Finally, in order to model non-continuous streams, we generalise 
the notion of domain representation to approximate representations. This is 
related to domain representations of structures with relations. 

4.1 Basic definitions 

We briefly recall some basic definitions and facts about domain representability. 
For motivation and details we refer to [47]. 

Let $X$ and $Y$ be topological spaces. Recall that a function $\nu: X \to Y$ is a 
quotient mapping if $U \subseteq Y$ is open if, and only if, $\nu^{-1}[U]$ is open in $X$. In case $\nu$ 
is surjective we then have that $X/{\sim}$ and $Y$ are homeomorphic spaces when the 
former is given the quotient topology and where $\sim$ is the equivalence relation 
induced on $X$ by $\nu$, i.e., $x \sim y \iff \nu(x) = \nu(y)$. Here is the basic definition. 

Definition 4.1. Let $X$ be a topological space, let $D$ be a domain and $D^R$ a 
subset of $D$. Then $(D, D^R, \nu)$ is a domain representation of $X$ if $\nu: D^R \to X$ 
is a surjective quotient map when $D^R$ is given the (relativised) Scott topology. 
In case $(D, \alpha)$ is an effective domain then $(D, D^R, \nu, \alpha)$ is an effective domain 
representation of $X$. 

Thus a domain representation $(D, D^R, \nu)$ of $X$ contains both concrete and 
proper approximations of elements of $X$, the compact elements in $D_c$, and “total” 
elements in $D^R$ containing sufficient information to represent elements of $X$ 
exactly via $\nu$. Since the function $\nu$ in the definition above is a quotient map we 
have $X \cong D^R/{\sim}$. 

Definition 4.2. A domain representation $(D, D^R, \nu)$ of a space $X$ is 

(i) upwards closed if whenever $x \in D^R$ and $x \sqsubseteq y$ then $y \in D^R$ and $\nu(x) = \nu(y)$; 

(ii) dense if for each $a \in D_c$, ${\uparrow}a \cap D^R \neq \emptyset$; 

(iii) local if $(\forall x, y \in D^R)(\nu(x) = \nu(y) \implies x \text{ and } y \text{ are consistent})$. 

Upwards closed domain representations $(D, D^R, \nu)$ are natural when regarding 
the ordering $\sqsubseteq$ on $D$ as an information ordering. If $x \in D^R$ completely 
determines $\nu(x) \in X$ and $x \sqsubseteq y$ then $y$ contains all the information of $x$ and 
hence also completely determines $\nu(x)$. All domain representations considered 
in this paper are upwards closed. 

The usual way to construct a domain representation of a space $X$ is to consider 
an approximation structure of concrete approximations of $X$. An approximation 
structure most often takes the form of a cusl $P = (P; \sqsubseteq, \bot)$. Then the 
representing domain is the ideal completion $\mathrm{Idl}(P)$ of $P$. A space $X$ will have 
many domain representations. It is up to the “user” to choose a representation 
appropriate for his or her purposes. We refer to [47] for a discussion of 
approximation structures. 

The next step is to represent continuous functions between topological spaces. 

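Definition 4.3. Let $(D, D^R, \nu)$ and $(E, E^R, \mu)$ be domain representations of 
$X$ and $Y$ respectively. A function $f: X \to Y$ is represented by (or lifts to) a 
continuous function $\bar{f}: D \to E$ if $\bar{f}[D^R] \subseteq E^R$ and $\mu(\bar{f}(x)) = f(\nu(x))$, for all 
$x \in D^R$. 

This means that the following diagram commutes. 

$$\begin{array}{ccc}
D & \xrightarrow{\ \bar{f}\ } & E \\
\uparrow & & \uparrow \\
D^R & \xrightarrow{\ \bar{f}|_{D^R}\ } & E^R \\
{\scriptstyle \nu}\downarrow & & \downarrow{\scriptstyle \mu} \\
X & \xrightarrow{\ f\ } & Y
\end{array}$$

Let $(D, D^R, \nu)$ and $(E, E^R, \mu)$ be domain representations of $X$ and $Y$ 
respectively. Suppose $\bar{f}: D \to E$ is such that $\bar{f}[D^R] \subseteq E^R$ and such that 
$\nu(x) = \nu(y) \implies \mu(\bar{f}(x)) = \mu(\bar{f}(y))$. Then $\bar{f}$ induces a unique function 
$f: X \to Y$ defined by $f(\nu(x)) = \mu(\bar{f}(x))$. In the terminology above, $f$ is 
represented by $\bar{f}$. 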

Proposition 4.4. If $f: X \to Y$ is represented by a continuous $\bar{f}: D \to E$ then 
$f$ is continuous. 

The proof is simple and depends on the fact that $\nu$ is assumed to be a quotient 
(only continuity of $\mu$ is required). 

Recall that $D_k$ is the set of computable elements of the effective domain 
$(D, \alpha)$. Suppose the topological space $X$ is represented by $(D, D^R, \nu, \alpha)$. Then 
the set $X_k$ of $(D, D^R, \nu, \alpha)$-computable elements of $X$ is the set 

$$X_k = \{x \in X : \nu^{-1}(x) \cap D_k \neq \emptyset\}.$$

Let $\bar{\alpha}$ be the canonical numbering of $D_k$ obtained from $\alpha$ as in Theorem 3.10 
and let $\Omega = \{n \in \omega : \bar{\alpha}(n) \in D^R\}$. Define $\tilde{\alpha}: \Omega \to X_k$ by 

$$\tilde{\alpha}(n) = \nu(\bar{\alpha}(n)).$$

The numbering $\tilde{\alpha}$ is the canonical numbering of $X_k$ obtained from the domain 
representation $(D, D^R, \nu, \alpha)$. 

Now suppose $(E, E^R, \mu, \beta)$ is a domain representation of $Y$ and suppose 
$f: X \to Y$ has an $(\alpha, \beta)$-effective representation $\bar{f}: D \to E$. Then we say that $f$ is 
$(\alpha, \beta)$-effective. It follows from the results for effective domains that $f[X_k] \subseteq Y_k$ 
and $f|_{X_k}: X_k \to Y_k$ is $(\tilde{\alpha}, \tilde{\beta})$-computable. Sufficient conditions for when an 
$(\tilde{\alpha}, \tilde{\beta})$-computable function $g: X_k \to Y_k$ can be extended to an $(\alpha, \beta)$-effective 
function $f: X \to Y$ are established in [2, Theorem 2.27]. These conditions are met by 
the real numbers $\mathbb{R}$, as well as by many other recursive metric spaces. 

We see that an effective domain representation of a topological space $X$ 
induces effectivity on $X$. Thus the effectivity of $X$ in the sense described here is 
dependent on the effective domain representation chosen. It is shown in [48] that 
other notions of effectivity considered in the literature for topological spaces and 
algebras, such as the algebra of real numbers M, are obtainable from effective 
domain representations, showing that the method of domain representation is 
not only flexible and general but also has sufficient strength. 

For a discrete space $N$ we have a domain representation $(N_\bot, N, \mathrm{id})$. Here 
is another representation providing more information in the sense of having a 
richer set of approximations. 

Example 4.5. Let $X$ be a discrete topological space and let 
$E = \{X\} \cup \mathcal{P}_f(X) \setminus \{\emptyset\}$ be the domain of finite subsets of $X$ ordered by 
reverse inclusion. Let $E^R$ consist of all singleton sets in $E$. Then $(E, E^R, \nu)$ is a 
domain representation of $X$ where $\nu(\{x\}) = x$; in fact, 

$$E^R \cong X.$$

We will denote the domain E by Vj{X). 

4.2 Standard representations of metric spaces 

We will briefly describe how to construct a domain representation of a metric 
space $X$. The domain representation constructed here will be referred to as a 
standard domain representation. 

Definition 4.6. Let $X$ be a metric space and let $P$ be a family of non-empty 
closed subsets of $X$, containing $X$ but not $\emptyset$. Then $P$ is a closed neighbourhood 
system for $X$ if the following are satisfied: 

(i) if $F, F' \in P$ and $F \cap F' \neq \emptyset$ then $F \cap F' \in P$, and 

(ii) if $x \in U$, where $U$ is open, then $(\exists F \in P)(x \in F^\circ \wedge F \subseteq U)$. 

Here $F^\circ$ denotes the interior of $F$. 

A closed neighbourhood system $P$ is a cusl when ordered by reverse inclusion. 
Clearly each metric space has a closed neighbourhood system since metric spaces 
are regular. 

Let $X$ be a metric space with metric $d$ and choose a closed neighbourhood 
system $P$ for $X$. Let $D$ be the ideal completion of the cusl $(P; \sqsubseteq, X)$, where $\sqsubseteq$ 
is reverse inclusion. For $F \in P$ we let 

$$\mathrm{diam}(F) = \sup_{x, y \in F} d(x, y).$$

Definition 4.7. An ideal $I \in D$ is converging if 

$$(\forall \varepsilon > 0)(\exists F \in I)(\mathrm{diam}(F) < \varepsilon).$$

It is easy to see that the intersection of a converging ideal is a singleton set. 
Let $D^R$ be the set of converging ideals and define $\nu: D^R \to X$ by 

$$\nu(I) = x \iff \bigcap I = \{x\}.$$

Theorem 4.8. Let $X$ be a metric space. Then $(D, D^R, \nu)$ constructed as above 
is a domain representation of $X$, which is upwards closed, dense and local. 

We say that $(D, D^R, \nu)$ is a standard domain representation of $X$. 

Theorem 4.9. Let $(D, D^R, \nu)$ and $(E, E^R, \mu)$ be standard domain 
representations of metric spaces $X$ and $Y$ respectively. Then every continuous 
function $f: X \to Y$ can be represented by a continuous function $\bar{f}: D \to E$. 

The proofs of the theorems above can be found in [2]. 

For the real numbers $\mathbb{R}$ we choose in this paper the closed neighbourhood 
system 

$$P = \{[a, b] : a \le b \text{ and } a, b \in \mathbb{Q}\} \cup \{\mathbb{R}\}$$

and denote its ideal completion $\mathrm{Idl}(P)$ by $\mathcal{R}$. Clearly, $P$ is a computable cusl 
and hence $\mathcal{R}$ is an effective domain. Thus $\mathcal{R} = (\mathcal{R}, \mathcal{R}^R, \nu)$ is an effective domain 
representation of $\mathbb{R}$. It is shown in [47] that the computability on $\mathbb{R}$ induced by 
$\mathcal{R}$ coincides with the notions of computability usually considered in recursive 
analysis (e.g. in [37]). 

It is easy to see that for an irrational point $x \in \mathbb{R}$ there exists only one 
converging ideal, namely the ideal $I_x = \{[a, b] : a < x < b\}$. However, for a 
rational point $r \in \mathbb{R}$ there exist four different converging ideals representing $r$. 
The existence of several ideals representing the same point in $\mathbb{R}$ is necessary for 
topological reasons. The converging ideals representing a rational point $r$ are: 

$$I_r = \{[a, b] : a < r < b\},$$

$$I_r^{+} = \{[a, b] : a \le r < b\},$$

$$I_r^{-} = \{[a, b] : a < r \le b\}, \text{ and}$$

$$I_r^{\pm} = \{[a, b] : a \le r \le b\}.$$

These are ordered by inclusion: $I_r$ is contained in both $I_r^{+}$ and $I_r^{-}$, and 
both of these are contained in $I_r^{\pm}$. 




Effective representations exist for a large class of metric spaces. However, 
often a more intricate construction than the one above is needed; see [2]. 

A discrete topological space $X$ can be given a discrete metric. Then 
$(X_\bot, X, \mathrm{id})$ is (isomorphic to) a standard representation of $X$ in the sense above. 
Similarly, the representation in Example 4.5 is a standard representation of $X$. 

4.3 Representing relations and non-continuous functions 

Streams need not be continuous. Thus, in order to represent such a stream space 
by a function space domain, we need a way to represent a non-continuous func- 
tion by a continuous function between domains. By Proposition 4.4 we know that 
this is impossible. We have to settle for representing non-continuous functions 
approximately. 

An analogous problem is how to represent a relation on a space. A canonical
example is the space of real numbers $\mathbb{R}$. How do we represent the $\le$
relation? The problem is that relations, in terms of their characteristic
functions, are often not continuous.

Let $X$ be a topological space and let $\mathbb{B} = \{\mathrm{true}, \mathrm{false}\}$
be the discrete boolean space. An $n$-ary relation $P$ on $X$ can be identified with
its characteristic function $c_P\colon X^n \to \mathbb{B}$ defined by

$$c_P(a_1,\dots,a_n) = \begin{cases} \mathrm{true}, & \text{if } P(a_1,\dots,a_n); \\ \mathrm{false}, & \text{if } \neg P(a_1,\dots,a_n). \end{cases}$$

The idea is to represent the possibly non-continuous characteristic function con- 
tinuously in such a way that it gives exact values at points of continuity and 
only proper approximations at points of discontinuity. We know that this is the 
best possible. 

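Let $(D, D^R, \nu)$ be a domain representation of $X$ and let $P$ be an $n$-ary
relation on $X$. Define $\bar c_P\colon D_c^n \to \mathbb{B}_\bot$ by

$$\bar c_P(a) = \begin{cases} \mathrm{true}, & \text{if } (\forall x \in (D^R)^n)(a \sqsubseteq x \Rightarrow P(\nu(x))); \\ \mathrm{false}, & \text{if } (\forall x \in (D^R)^n)(a \sqsubseteq x \Rightarrow \neg P(\nu(x))); \\ \bot, & \text{otherwise.} \end{cases}$$

$\bar c_P$ is clearly monotone and hence extends uniquely to a continuous function

$$\bar c_P\colon D^n \to \mathbb{B}_\bot.$$

We say that $\bar c_P$ represents $c_P$ or $P$ approximately.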

Example 4.10. Consider the standard interval representation $\mathcal{R}$ of the
reals and the relation $\le$. Then

$$\bar c_\le([a,b],[c,d]) = \begin{cases} \mathrm{true}, & \text{if } b \le c; \\ \mathrm{false}, & \text{if } d < a; \\ \bot, & \text{otherwise.} \end{cases}$$

Note that $\bar c_\le$ is effective. If $x < y$ in $\mathbb{R}$ then
$\bar c_\le(I_x, I_y) = \mathrm{true}$ and if $y < x$ in $\mathbb{R}$ then
$\bar c_\le(I_x, I_y) = \mathrm{false}$. In case $x = y$ then
$\bar c_\le(I_x, I_x) = \bot$.

The function $c_\le\colon \mathbb{R}^2 \to \mathbb{B}$ is continuous on

$$\{(x,y) : x \ne y\} \subseteq \mathbb{R}^2,$$

and discontinuous on the diagonal. Thus $\bar c_\le$ represents $c_\le$ exactly on
points of continuity. At points of discontinuity $\bar c_\le$ only provides the
trivial approximation of the value of $c_\le$.

It is well-known that $\le$ is not decidable or even semidecidable on the recursive
reals, the problem being that equality on the recursive reals is not semidecidable.
(Equality is cosemidecidable, i.e., $\ne$ is semidecidable.) This is reflected by
the discontinuity of $c_\le$.
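
The behaviour of the approximate characteristic function on compact elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation is taken to be $\le$, closed rational intervals are modelled as pairs of `Fraction`s, `None` stands for the bottom element, and the name `leq_lift` is hypothetical.

```python
from fractions import Fraction


def leq_lift(x, y):
    """Approximate characteristic function of <= on closed rational intervals.

    x = (a, b) and y = (c, d) stand for the intervals [a, b] and [c, d].
    Returns True, False, or None (None plays the role of bottom).
    """
    (a, b), (c, d) = x, y
    if b <= c:   # every point of [a, b] is <= every point of [c, d]
        return True
    if d < a:    # every point of [c, d] is strictly below every point of [a, b]
        return False
    return None  # the intervals do not yet decide the relation


# Shrinking open-style neighbourhoods of the same point never decide the relation.
one = Fraction(1)
undecided = [
    leq_lift((one - Fraction(1, n), one + Fraction(1, n)),
             (one - Fraction(1, n), one + Fraction(1, n)))
    for n in range(1, 20)
]
```

Every entry of `undecided` is `None`, which mirrors the fact that the lifted function yields only bottom on the diagonal when both arguments are the smallest ideal of the same point.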

We now want to generalise the continuous representation of relations to
continuous representations of non-continuous functions. The idea is the same. We
want our representation to be exact at points of continuity and as good as
possible in terms of approximations at points of discontinuity. Here is the
definition.

Definition 4.11. Let $(D, D^R, \nu_X)$ and $(E, E^R, \nu_Y)$ be domain
representations of the topological spaces $X$ and $Y$, respectively. Then a function
$f\colon X \to Y$ (not necessarily continuous) is said to be represented
approximately by (or lifts approximately to) $\bar f\colon D \to E$ if

(i) $\bar f$ is continuous,

(ii) $(\forall x \in D^R)(f \text{ continuous at } \nu_X(x) \Rightarrow \bar f(x) \in E^R \text{ and } \nu_Y(\bar f(x)) = f(\nu_X(x)))$,

and

(iii) $(\forall x \in D^R)(f \text{ not continuous at } \nu_X(x) \Rightarrow (\exists y \in \nu_Y^{-1}[f(\nu_X(x))])(\bar f(x) \sqsubseteq y))$.

The following example illustrates the notion above. 

Example 4.12. The floor function $\lfloor\cdot\rfloor\colon \mathbb{R} \to \mathbb{Z}$
is discontinuous at precisely the integer points. We shall represent the floor
function by a continuous domain function in the above precise sense.

Let $\mathcal{R}$ be the standard interval domain representation of $\mathbb{R}$. We
could choose $\mathbb{Z}_\bot$ as a representation of $\mathbb{Z}$ and proceed as in
the case of representing relations. However we can do much better if we choose a
domain representation with more care. Let
$(\mathcal{P}_f(\mathbb{Z}), \mathcal{P}_f(\mathbb{Z})^R, \mu)$ be the domain
representation of $\mathbb{Z}$ described in Example 4.5. Define
$\bar f\colon \mathcal{R}_c \to \mathcal{P}_f(\mathbb{Z})$ by

$$\bar f([a,b]) = \{m \in \mathbb{Z} : \lfloor a \rfloor \le m \le \lfloor b \rfloor\}$$

and $\bar f(\mathbb{R}) = \mathbb{Z}$ (i.e., $\bar f$ is strict). Clearly, $\bar f$
is monotone and hence extends uniquely to a continuous function
$\bar f\colon \mathcal{R} \to \mathcal{P}_f(\mathbb{Z})$.

Let $x \in \mathbb{R}$ and let $I_x \in \mathcal{R}^R$ be the smallest ideal
representing $x$. If $x$ is not an integer then $\bar f(I_x) = \{\lfloor x \rfloor\}$.
Thus $\bar f$ represents the floor function exactly for all points of continuity.
Now consider an integer $m$. Recall the four different representations of $m$
described in Section 4.2. It is easily seen that

$\bar f(I_m) = \{m-1, m\}$,
$\bar f(I_m^-) = \{m-1, m\}$,
$\bar f(I_m^+) = \{m\}$, and
$\bar f(I_m^\pm) = \{m\}$.

It follows that $\bar f$ represents the floor function approximately in the sense
of Definition 4.11. However, thanks to our choice of representation for
$\mathbb{Z}$ we are able to obtain much information also at points of
discontinuity. This illustrates the importance of choosing appropriate
representations of the data types. Had we chosen $\mathbb{Z}_\bot$ to represent
$\mathbb{Z}$ then the representation of the floor function would provide no
information at points of discontinuity.
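
The lifted floor function of Example 4.12 is easy to experiment with on rational intervals. The following is a minimal sketch, assuming closed intervals are given by two `Fraction` endpoints and finite sets of integers are modelled by Python sets; the function name `floor_lift` is hypothetical.

```python
import math
from fractions import Fraction


def floor_lift(a, b):
    """Value of the lifted floor function on the compact interval [a, b].

    Returns the finite set of all integers m with floor(a) <= m <= floor(b).
    """
    return set(range(math.floor(a), math.floor(b) + 1))


# Intervals shrinking to the non-integer 5/2 eventually give the exact answer {2}.
around_non_integer = floor_lift(Fraction(12, 5), Fraction(13, 5))

# Intervals with the integer 2 in their interior can never exclude 1.
around_integer = floor_lift(Fraction(19, 10), Fraction(21, 10))

# Intervals whose left endpoint is 2 itself already determine the value.
from_the_right = floor_lift(Fraction(2), Fraction(21, 10))
```

The three sample values are `{2}`, `{1, 2}` and `{2}`, matching the exact value at a point of continuity and the two kinds of approximation obtained at an integer.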

We close this section by showing that under rather general conditions, satisfied
by the representations considered in this paper, there is a best continuous
approximate representation of an arbitrary function, if there is one at all.

Theorem 4.13. Let $(D, D^R, \nu_X)$ and $(E, E^R, \nu_Y)$ be domain representations
of $X$ and $Y$, respectively. Assume that $D^R$ is dense in $D$, and that
$(E, E^R, \nu_Y)$ is upwards closed and local. Let $f\colon X \to Y$ be a function
and assume that $f$ has one approximate representation in $[D \to E]$. Then there
is a best approximate representation $\bar f \in [D \to E]$ in the sense of the
domain ordering.

It should be remarked that in the cases of streams that we consider we always
have an approximate representation (see Theorem 5.2) and hence a best
approximate representation. However, this representation is best only in the
sense of the domain ordering. It may not be the best in the sense of
computability, i.e. there may be a computable representation even though the best
representation is not computable. Of course, the latter only affects the values
at points of discontinuity.

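Proof. Let $\mathcal{A}_f = \{\bar f \in [D \to E] : \bar f \text{ represents } f \text{ approximately}\}$.
We show that $\mathcal{A}_f$ is directed. $\mathcal{A}_f \ne \emptyset$ by
assumption. Suppose $\bar f_1, \bar f_2 \in \mathcal{A}_f$. We first show that
$\bar f_1$ and $\bar f_2$ are consistent. For this it suffices by the density of
$D^R$ to show that $\bar f_1(x)$ and $\bar f_2(x)$ are consistent for each
$x \in D^R$.

Fix $x \in D^R$ and assume $f$ is continuous at $\nu_X(x)$. Thus
$\bar f_1(x), \bar f_2(x) \in E^R$ and

$$\nu_Y(\bar f_1(x)) = \nu_Y(\bar f_2(x)) = f(\nu_X(x)).$$

But by the hypotheses on $(E, E^R, \nu_Y)$ the supremum
$\bar f_1(x) \sqcup \bar f_2(x)$ exists and
$\nu_Y(\bar f_1(x) \sqcup \bar f_2(x)) = f(\nu_X(x))$.

Now assume $f$ is discontinuous at $\nu_X(x)$. Then there are $y_1, y_2 \in E^R$
such that $\nu_Y(y_1) = \nu_Y(y_2) = f(\nu_X(x))$ and $\bar f_1(x) \sqsubseteq y_1$
and $\bar f_2(x) \sqsubseteq y_2$. Again, by the assumptions on $(E, E^R, \nu_Y)$,

$$\bar f_1(x) \sqcup \bar f_2(x) \sqsubseteq y_1 \sqcup y_2 \in E^R$$

and $\nu_Y(y_1 \sqcup y_2) = f(\nu_X(x))$. We conclude that $\bar f_1$ and
$\bar f_2$ are consistent and $\bar f_1 \sqcup \bar f_2 \in \mathcal{A}_f$, that is,
$\mathcal{A}_f$ is directed.

Let $\bar f = \bigsqcup \mathcal{A}_f$. We need to show that
$\bar f \in \mathcal{A}_f$. Let $x \in D^R$ be such that $f$ is continuous at
$\nu_X(x)$ and let $g \in \mathcal{A}_f$. Then $g(x) \sqsubseteq \bar f(x)$ and
$g(x) \in E^R$ so $\bar f(x) \in E^R$ and

$$\nu_Y(\bar f(x)) = \nu_Y(g(x)) = f(\nu_X(x)).$$

Now suppose $f$ is discontinuous at $\nu_X(x)$. Let $y \in E^R$ be the maximal
element such that $\nu_Y(y) = f(\nu_X(x))$. The assumptions on $(E, E^R, \nu_Y)$
imply that such $y$ exists. Thus $g(x) \sqsubseteq y$ for each
$g \in \mathcal{A}_f$ and hence

$$\bar f(x) = \bigsqcup \{g(x) : g \in \mathcal{A}_f\} \sqsubseteq y,$$

proving that $\bar f \in \mathcal{A}_f$. □

5 Modelling streams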

In this section we give a semantic model for a set of streams $T \to A$ using the
function space of the representing domains.

Let $(D_T, D_T^R, \nu_T)$ and $(D_A, D_A^R, \nu_A)$ be domain representations of $T$
and $A$ respectively. Then, as described in Section 4.1, $D_T$ and $D_A$ capture the
original topologies of $T$ and $A$, or, alternatively, induce topologies on $T$ and
$A$. The domain $[D_T \to D_A]$ contains only continuous functions. Thus, by
Proposition 4.4, any function $f\colon T \to A$ (totally) represented by
$\bar f \in [D_T \to D_A]$ as in Definition 4.3 is continuous.

When time $T$ is discrete it is natural to model $T$ by $\mathbb{N}$ or $\mathbb{Z}$
(depending on the existence of an initial time) with the discrete topology. In
this case every function from $T$ to $A$ is continuous, i.e., all streams from $T$
to $A$ are continuous.

Let us now consider continuous time $T$. In this case we model $T$ by
$\mathbb{R}_+$ or $\mathbb{R}$ with their usual topologies. Assume that $A$ is a
discrete set of data, i.e., $A$ has the discrete topology. Then the continuous
streams from $T$ to $A$ are precisely the constant streams since $T$ is a connected
space. Thus all interesting streams from continuous time to discrete data are
non-continuous. One way to deal with such streams, necessary from the point of
view of computability, is provided by the use of domains. A stream $f$ in the
complete stream space $(T \to A)$ is represented by a continuous function
$\bar f\colon D_T \to D_A$ which gives correct values at (the representation of)
points of continuity of $f$ and approximate values at points of discontinuity (see
Definition 4.11).

Here is the formal definition.

Definition 5.1. Let time $T$ and data $A$ have domain representations
$(D_T, D_T^R, \nu_T)$ and $(D_A, D_A^R, \nu_A)$ respectively and let $(T \to A)$ be a
stream space. Then $[D_T \to D_A]$ is an approximate domain representation of
$(T \to A)$ if each stream $\varphi \in (T \to A)$ has an approximate representation
$\bar\varphi$ in $[D_T \to D_A]$.

To simplify the presentation we make the assumption that discrete time is
modelled by $\mathbb{N}$ and continuous time is modelled by $\mathbb{R}$. The reader
can easily modify all arguments for the mentioned variants of models for time.

As domain representation for time $T$ we choose
$(\mathbb{N}_\bot, \mathbb{N}, \mathrm{id})$ in the discrete case and the standard
interval domain representation $\mathcal{R}$ of $\mathbb{R}$ for the continuous
case. We denote either representation by $D_T$.

Theorem 5.2. Let $T$ be time, discrete or continuous, and let $A$ be a data type.
Assume that $A$ is a metric space and let $(D_A, D_A^R, \nu_A)$ be a standard domain
representation of $A$. Then each stream

$$\varphi\colon T \to A$$

has an approximate representation $\bar\varphi$ in $[D_T \to D_A]$.

Thus the complete stream space from $T$ to $A$ has an approximate domain
representation using the function space obtained from standard domain
representations of time and data. Of course, a discrete space is a metric space
when given a discrete metric.

Recall that $\mathcal{R}_c$, the cusl of compact elements of $\mathcal{R}$, consists
of all closed intervals with rational endpoints and $\mathbb{R}$, ordered by
reverse inclusion.

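Proof. If $T$ is discrete then each stream $\varphi\colon T \to A$ is continuous. By
Theorem 4.9, $\varphi$ lifts to a continuous function
$\bar\varphi\colon D_T \to D_A$.

Now we assume that $T$ is continuous time and is modelled by $\mathbb{R}$ with the
standard interval domain representation $\mathcal{R}$.

Let $\varphi\colon T \to A$ be a stream. We define
$\bar\varphi\colon \mathcal{R}_c \to D_A$ by

$$\bar\varphi([a,b]) = \{F \in (D_A)_c : \varphi[[a,b]] \subseteq F^\circ\},$$

where $F^\circ$ denotes the interior of $F$. Then $\bar\varphi$ is monotone and
extends uniquely to a continuous function $\bar\varphi\colon \mathcal{R} \to D_A$.
In fact,

$$\bar\varphi(I) = \{F \in (D_A)_c : (\exists [a,b] \in I)(\varphi[[a,b]] \subseteq F^\circ)\}.$$

Suppose $\varphi$ is continuous at the point $x \in T$ and consider the ideal
$I_x = \{[a,b] : a < x < b\} \in \mathcal{R}^R$. Recall that $I_x$ is the smallest
ideal representing $x$. Let $J = \bar\varphi(I_x)$. We need to show that
$J \in D_A^R$ and that $\nu_A(J) = \varphi(x)$. Put differently, we need to show
that $\bigcap J = \{\varphi(x)\}$. Obviously $\varphi(x) \in \bigcap J$. Given
$\varepsilon > 0$, take $F \in (D_A)_c$ such that

$$\varphi(x) \in F^\circ \subseteq F \subseteq B(\varphi(x), \varepsilon),$$

where the latter is the open $\varepsilon$-ball around $\varphi(x)$. Since $\varphi$
is continuous at $x$ and $F^\circ$ is open there are $a', b' \in \mathbb{Q}$,
$a' < x < b'$ and $\varphi[(a', b')] \subseteq F^\circ$. But then there is
$[a,b] \in I_x$ such that

$$\varphi[[a,b]] \subseteq \varphi[(a',b')] \subseteq F^\circ,$$

i.e., $F \in J$. But $\mathrm{diam}(F) \le 2\varepsilon$ and $\varepsilon$ was
arbitrary. It follows that $\bigcap J = \{\varphi(x)\}$.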

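Now suppose $x \in T$ is a point of discontinuity of $\varphi$. It suffices to show
that for every ideal $I$ with $I_x \subseteq I$, each $F \in \bar\varphi(I)$
contains $\varphi(x)$, so that $\bar\varphi(I)$ lies below the largest ideal
representing $\varphi(x)$. But this follows directly from the definition of
$\bar\varphi$. □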

The theorem shows that the domain of continuous functions $[D_T \to D_A]$
contains representations of all streams from $T$ to $A$. The representations are,
however, only approximate on points of discontinuity. From a computational
point of view this is quite reasonable. We cannot compute exactly on continuous
data types, including continuous time; we can only compute on approximations of
data. At points of continuity we obtain approximations of arbitrary precision. At
points of discontinuity we can only expect proper approximations. However, with
an appropriate choice of domain representation this may nonetheless produce
important information.
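
For discrete data the approximation idea can be made concrete. The sketch below is an illustration only, under several assumptions not fixed by the text: a signal is a right-continuous step function with finitely many switch times, data are represented by finite sets of possible values, and the class name `StepSignal` and its method `lift` are hypothetical. The lift of an interval is simply the set of values the signal takes on it.

```python
from bisect import bisect_right
from fractions import Fraction


class StepSignal:
    """A right-continuous step signal on the non-negative reals (assumed model).

    values[0] holds on [0, t1), values[1] on [t1, t2), and so on.
    """

    def __init__(self, switch_times, values):
        if len(values) != len(switch_times) + 1:
            raise ValueError("need exactly one more value than switch times")
        if list(switch_times) != sorted(set(switch_times)):
            raise ValueError("switch times must be strictly increasing")
        self.times = list(switch_times)
        self.values = list(values)

    def __call__(self, t):
        # Number of switch times <= t selects the current value.
        return self.values[bisect_right(self.times, t)]

    def lift(self, a, b):
        """Approximation on the interval [a, b]: the set of values taken there."""
        first = bisect_right(self.times, a)
        last = bisect_right(self.times, b)
        return set(self.values[first:last + 1])


signal = StepSignal([Fraction(1), Fraction(2)], ["lo", "hi", "lo"])
exact = signal.lift(Fraction(0), Fraction(1, 2))         # no switch inside
straddling = signal.lift(Fraction(1, 2), Fraction(3, 2))  # contains the switch at 1
```

Here `exact` is the singleton `{"lo"}`, while `straddling` is `{"lo", "hi"}`: around a discontinuity the approximation is proper but still narrows the value down to two candidates.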

Corollary 5.3. Each stream $\varphi\colon T \to A$ has a best representation
$\bar\varphi$ in $[D_T \to D_A]$.

Proof. The hypotheses of Theorem 4.13 are satisfied for standard representations. □

From an approximate domain representation of the complete stream space
$(T \to A)$ we define an equivalence relation $\sim$ on $(T \to A)$ by saying that
$\varphi \sim \psi$ if they have the same best approximation. Thus we obtain a
domain representation in the sense of Definition 4.1 of $(T \to A)/\!\sim$.

Finally we consider computability of streams. Note that by Ceitin's theorem [7]
each computable stream $\varphi\colon T \to A$, where $T$ is a computable metric
space in the sense of [2], is continuous. Thus in order to consider
"computability" of non-continuous streams it is necessary to consider approximate
representations.

Definition 5.4. Let $(D_T, D_T^R, \nu_T, \alpha)$ and $(D_A, D_A^R, \nu_A, \beta)$ be
effective domain representations of time $T$ and data $A$, respectively. Then a
stream $\varphi\colon T \to A$ is computable if there is an
$(\alpha, \beta)$-effective $\bar\varphi \in [D_T \to D_A]$ representing $\varphi$
approximately.

As an example we consider non-zeno signals into a discrete space $A$. (Recall
Definition 2.14.)

We choose the domain representation $\mathcal{P}_f(A)$ for $A$. This is clearly an
effective domain when $A$ is countable, and the set of maximal elements is
decidable.

Let $\mathcal{R}_+$ be the standard interval domain representation of
$\mathbb{R}_+$ obtained as the ideal completion of the cusl

$$\{[a,b] : 0 \le a \le b \text{ and } a, b \in \mathbb{Q}\} \cup \{\mathbb{R}_+\}$$

ordered by reverse inclusion.

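Proposition 5.5. If $\varphi$ is a non-zeno signal represented by
$\bar\varphi \in [\mathcal{R}_+ \to \mathcal{P}_f(A)]$ then

$$Z \sqsubseteq \bar\varphi([a,b]) \implies \varphi[[a,b]] \subseteq Z.$$

Proof. Assume $Z \sqsubseteq \bar\varphi([a,b])$ and let $x \in [a,b]$. Note that
$[a,b] \in D$, the largest ideal representing $x$. If $\varphi$ is continuous at
$x$ then $\bar\varphi(D) = \{\varphi(x)\}$. But $[a,b] \in D$ so
$Z \supseteq \bar\varphi([a,b]) \supseteq \{\varphi(x)\}$. If $\varphi$ is not
continuous at $x$ then, by definition, we have
$\bar\varphi(D) \sqsubseteq \{\varphi(x)\}$, so
$\varphi(x) \in \bar\varphi(D) \subseteq \bar\varphi([a,b]) \subseteq Z$, i.e., again
$\varphi(x) \in Z$. □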

Proposition 5.6. Suppose $\varphi$ is a computable non-zeno signal represented by
the effective $\bar\varphi \in [\mathcal{R}_+ \to \mathcal{P}_f(A)]$. Then each
discontinuity of $\varphi$ is a recursive real.

Proof. Suppose $x \in \mathbb{R}_+$ is a point of discontinuity. Let $a < x < b$ be
rational so that $\varphi$ is constant on $(a, x)$ and on $(x, b)$. Let

$$I = \{[c,d] : (\exists c', d')(\exists m, n \in A)(a < c < c' < d' < d < b,\ \bar\varphi([c,c']) = \{m\},\ \bar\varphi([d',d]) = \{n\} \text{ and } m \ne n)\}.$$

Note that $I$ is semidecidable. If $[c,d] \in I$ then $\varphi[(a,x)] = \{m\}$ and
$\varphi[(x,b)] = \{n\}$ so $x \in (c,d)$. Let $y$ be such that $a < y < x$. Then, by
the continuity of $\varphi$ at $y$, there is $[c,c'] \in I_y$ such that
$\bar\varphi([c,c']) = \{m\}$. Similarly, we obtain appropriate $[d',d]$. That is,
$I$ generates the ideal $I_x$. We have shown that $I_x$ is a computable element of
$\mathcal{R}_+$ and hence that $x$ is a recursive real. □

Finally we note that for a computable non-zeno signal $\varphi$ the set of
recursive reals at which $\varphi$ is continuous is semidecidable (recall the
exact definition in Section 3.2). For $\varphi$ is continuous at $x$ if, and only
if,

$$(\exists [a,b] \in I_x)(\bar\varphi([a,b]) \in \mathcal{P}_f(A)^R).$$

6 Modelling stream transformers 

Now we consider stream transformers $F\colon (R \to B) \to (T \to A)$. We have
already seen that some stream spaces naturally include non-continuous streams.
It is equally clear that there are natural stream transformers which take
continuous streams to non-continuous streams. For example, let $A = B$ be a
non-trivial discrete data type, $R$ discrete time and $T$ continuous time. Let
$\tau\colon T \to R$ and define $F\colon (R \to B) \to (T \to A)$ by

$$F(\varphi)(t) = \varphi(\tau(t)).$$

Then every stream in $(R \to B)$ is continuous and hence a non-trivial $F$ takes
some continuous streams to non-continuous streams, that is, natural stream
transformers do not necessarily preserve continuity.
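
A small sketch makes the example tangible. It assumes, purely for illustration, that discrete time is the natural numbers, that the retiming is the floor function, and that the input stream alternates between 0 and 1; the names `retime` and `alternating` are hypothetical.

```python
import math


def retime(phi, tau):
    """Stream transformer F with F(phi)(t) = phi(tau(t))."""
    def transformed(t):
        return phi(tau(t))
    return transformed


def alternating(n):
    """A discrete-time stream over the data {0, 1}; trivially continuous."""
    return n % 2


# Continuous-time output: a square wave that jumps at every integer time.
square_wave = retime(alternating, math.floor)
samples = [square_wave(t) for t in (0.5, 0.999, 1.0, 1.5, 2.0)]
```

The sampled values are `[0, 0, 1, 1, 0]`: the output stream is discontinuous at the integers even though every input stream over discrete time is continuous.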

On the other hand there is an intuitive feeling that a stream transformer that
we would want to model should be continuous in the sense that approximations
to the value of an output stream at a specific time should only depend on a
"finite" part or approximation of the input stream. In order to make this precise
we need to model approximations of streams, and for this our method of domain
representability is natural.

6.1 Transformations of continuous streams

In this subsection we consider the following question. When does a continuous
stream transformer

$$F\colon C(R \to B) \to C(T \to A)$$

taking continuous streams to continuous streams have a continuous lifting
$\bar F\colon [D_R \to D_B] \to [D_T \to D_A]$?

More precisely, let $C(R \to B)$ and $C(T \to A)$ denote the spaces of continuous
streams, where $B$ and $A$ are metric spaces and where $R$ and $T$ are any pair of
our usual models of time. The spaces of continuous streams $C(R \to B)$ and
$C(T \to A)$ are given the compact-open topology. On $C(T \to A)$ this is the
topology generated by the subbasic open sets

$$W(K, U) = \{f \in C(T \to A) : f[K] \subseteq U\},$$

where $K \subseteq T$ is compact and $U \subseteq A$ is open. By a continuous stream
transformer $F\colon C(R \to B) \to C(T \to A)$ we mean that $F$ is continuous with
respect to the compact-open topologies.

Let $(D_A, D_A^R, \nu_A)$ and $(D_B, D_B^R, \nu_B)$ be standard domain
representations of $A$ and $B$ respectively. For simplicity in the exposition we
choose $\mathbb{N}$ to model discrete time and $\mathbb{R}$ to model continuous
time. The domain representations we consider for time are
$(\mathbb{N}_\bot, \mathbb{N}, \mathrm{id})$ and
$(\mathcal{R}, \mathcal{R}^R, \mu)$, respectively, where $\mathcal{R}$ is the
standard interval domain representing $\mathbb{R}$.

Given time $T$ and data set $A$, with the chosen domain representations as
above, let

$$[D_T \to D_A]^R = \{\bar f \in [D_T \to D_A] : \bar f[D_T^R] \subseteq D_A^R\}.$$

We know by Proposition 4.4 that each $\bar f \in [D_T \to D_A]^R$ induces a unique
continuous function $f\colon T \to A$. Denoting $f$ by $\nu(\bar f)$ we obtain the
following theorem.

Theorem 6.1. $([D_T \to D_A], [D_T \to D_A]^R, \nu)$ is an upwards closed domain
representation of $C(T \to A)$, when the latter is given the compact-open topology.

The theorem follows from the fact that $T$ is locally compact and that standard
domain representations of $T$ and $A$ are used. The proof of the general fact
appears in Blanck [3]. The special case of $T = A = \mathbb{R}$ appears in
di Gianantonio [14].

In order to answer the question initially posed in this section, we first prove
the following general lifting theorem.

Theorem 6.2. Let $X$ be a topological space with a dense domain representation
$(E, E^R, \mu)$. Let $T$ and $A$ be time and data as above with domain
representations $D_T$ and $D_A$. Then each continuous

$$\varphi\colon X \to C(T \to A)$$

has a continuous lifting

$$\bar\varphi\colon E \to [D_T \to D_A].$$

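Proof. We consider the case $T = \mathbb{R}$ and $D_T = \mathcal{R}$. The proof for
$T = \mathbb{N}$ is similar but simpler.

For $a \in E_c$ let $B_a = \{x \in E : a \sqsubseteq x\}$, the basic open set
determined by $a$. Define $\bar\varphi\colon E_c \to [\mathcal{R} \to D_A]$ by

$$\bar\varphi(a) = \bigsqcup \Big\{ \bigsqcup_{i=1}^{n} ([c_i,d_i]; G_i) : n \ge 1 \text{ and } (\forall i = 1,\dots,n)(\varphi\mu[B_a \cap E^R] \subseteq W([c_i,d_i], G_i^\circ)) \Big\}.$$

To see that $\bar\varphi(a)$ is well-defined consider $([c_i,d_i]; G_i)$ for
$i = 1$ and $2$ where $\varphi\mu[B_a \cap E^R] \subseteq W([c_i,d_i], G_i^\circ)$.
We must show that $([c_1,d_1]; G_1)$ and $([c_2,d_2]; G_2)$ are consistent. Suppose
$[c_1,d_1] \cap [c_2,d_2] \ne \emptyset$ with $y \in \mathbb{R}$ as a witness. Let
$x \in B_a \cap E^R$, which exists by the density of $E^R$. Then
$\varphi\mu(x)(y) \in G_1 \cap G_2$ so $([c_1,d_1]; G_1)$ and $([c_2,d_2]; G_2)$ are
consistent. Generalising the argument to arbitrary elements in
$[\mathcal{R} \to D_A]_c$ proves that $\bar\varphi$ is well-defined.

It is clear from its definition that $\bar\varphi$ is monotone and hence extends
uniquely to a continuous function $\bar\varphi\colon E \to [\mathcal{R} \to D_A]$.
We now show that $\bar\varphi$ represents $\varphi$. Let $\bar x \in E^R$ and
suppose $\mu(\bar x) = x \in X$. We need to show that
$\nu(\bar\varphi(\bar x)) = \varphi(x)$ where
$([\mathcal{R} \to D_A], [\mathcal{R} \to D_A]^R, \nu)$ is the domain representation
of $C(\mathbb{R} \to A)$. For this it suffices to show, for $y \in \mathbb{R}$,

$$I_{\varphi(x)(y)} \subseteq \bar\varphi(\bar x)(I_y).$$

Let $G \in I_{\varphi(x)(y)}$, that is, $\varphi(x)(y) \in G^\circ$. By the
continuity of $\varphi(x)$ there is $[c,d] \in I_y$ such that
$\varphi(x) \in W([c,d], G^\circ)$. By the continuity of $\varphi$ the set
$\varphi^{-1}[W([c,d], G^\circ)]$ is open. Choose $a \in E_c$ such that

$$\bar x \in B_a \cap E^R \subseteq \mu^{-1}\varphi^{-1}[W([c,d], G^\circ)].$$

Thus $a \sqsubseteq \bar x$ and

$$\varphi(x) = \varphi\mu(\bar x) \in \varphi\mu[B_a \cap E^R] \subseteq W([c,d], G^\circ).$$

But then

$$([c,d]; G) \sqsubseteq \bar\varphi(a) \sqsubseteq \bar\varphi(\bar x)$$

and hence, since $[c,d] \in I_y$, $G \in \bar\varphi(\bar x)(I_y)$. This shows that
$I_{\varphi(x)(y)} \subseteq \bar\varphi(\bar x)(I_y)$.

□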

In view of the above theorem it suffices, for the problem at hand, to consider
sufficient conditions for density of the domain representation $[D_R \to D_B]$ of
the space of continuous streams $C(R \to B)$. We first consider the case when $R$
is discrete, i.e., $R = \mathbb{N}$.

Lemma 6.3. $[\mathbb{N}_\bot \to D_B]$ is a dense representation of
$C(\mathbb{N} \to B)$.

Proof. Suppose $\bigsqcup_{i=1}^{k} (n_i; F_i) \in [\mathbb{N}_\bot \to D_B]_c$.
Without loss of generality, recalling the definition of a closed neighbourhood
system, we can assume the $n_i$ are distinct. Choose $x_0 \in B$ and
$x_i \in F_i$. Then define $\bar f\colon \mathbb{N}_\bot \to D_B$ by

$$\bar f(n) = \begin{cases} I_{x_i} \sqcup [F_i], & \text{if } n = n_i; \\ I_{x_0}, & \text{if } n \ne n_i \text{ for } i = 1,\dots,k, \end{cases}$$

and $\bar f(\bot) = \bot$. Clearly $\bigsqcup_{i=1}^{k} (n_i; F_i) \sqsubseteq \bar f$
and $\bar f \in [\mathbb{N}_\bot \to D_B]^R$.

In case $(\bot; F)$ appears in the compact element we alter the definition of
$\bar f$ by choosing $x_i \in F \cap F_i$, $x_0 \in F$, and setting
$\bar f(\bot) = [F]$. □

An effective domain representation $(D, D^R, \nu, \alpha)$ is effectively dense if
there is a total recursive function $d$ which given an $\alpha$-index of
$a \in D_c$ computes an $\bar\alpha$-index of some $x \in D^R$ such that
$a \sqsubseteq x$. That is, for each $n \in \omega$,

$$\alpha(n) \sqsubseteq \bar\alpha d(n) \in D^R.$$

It follows from the proof of Lemma 6.3 that if $D_B$ is an effective domain
representation of $B$ then the domain representation

$$([\mathbb{N}_\bot \to D_B], [\mathbb{N}_\bot \to D_B]^R, \nu)$$

of $C(\mathbb{N} \to B)$ is effectively dense.

Corollary 6.4. Let $R$ be discrete time, let $T$ be discrete or continuous time,
and let $A$ and $B$ be metric spaces. Then each continuous stream transformer

$$F\colon C(R \to B) \to C(T \to A)$$

lifts to a continuous

$$\bar F\colon [D_R \to D_B] \to [D_T \to D_A].$$

The case when $R$ is continuous time, i.e. $R$ is modelled by $\mathbb{R}$, is more
delicate. The density of the domain representation $[\mathcal{R} \to D_B]$ depends
on topological properties of the metric space $B$ and the standard domain
representation $D_B$ of $B$. Note that if $B$ is a discrete metric space, say
$\mathbb{N}$ with standard domain representation
$(\mathbb{N}_\bot, \mathbb{N}, \mathrm{id})$, then $[\mathcal{R} \to \mathbb{N}_\bot]$
is not a dense representation. For example,

$$([0,1]; 1) \sqcup ([2,3]; 2)$$

has no representing element above it.
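
The reason can be spelled out in a short worked argument. It is a sketch that uses only the fact, from Proposition 4.4, that a representing element of the function space induces a continuous function.

```latex
Suppose $\bar f \in [\mathcal{R} \to \mathbb{N}_\bot]^R$ satisfies
$([0,1];1) \sqcup ([2,3];2) \sqsubseteq \bar f$, and let
$f\colon \mathbb{R} \to \mathbb{N}$ be the continuous function induced by $\bar f$.
Then
\[
  f(x) = 1 \ \text{for } x \in [0,1],
  \qquad
  f(x) = 2 \ \text{for } x \in [2,3].
\]
Since $\mathbb{R}$ is connected and $\mathbb{N}$ is discrete, $f$ must be constant,
which contradicts $f(0) = 1 \neq 2 = f(2)$.
```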

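Below we let $\mathcal{R}$ be the standard interval domain representation of
$\mathbb{R}$ and we let $(E, E^R, \mu)$ be a standard domain representation of a
metric space $X$. For each $[a,b] \in \mathcal{R}_c$ and $F \in E_c$ we use the
notation

$$(([a,b]; F)) = \{f\colon \mathbb{R} \to X \mid f \text{ is continuous and } f([a,b]) \subseteq F\}.$$

Lemma 6.5. Let $f \in \bigcap_{i=1}^{n} (([a_i,b_i]; F_i))$. Then
$\{([a_i,b_i]; F_i) : i = 1,\dots,n\}$ is consistent in $[\mathcal{R} \to E]$ and
there is $\bar f \in [\mathcal{R} \to E]^R$ representing $f$ such that

$$\bigsqcup_{i=1}^{n} ([a_i,b_i]; F_i) \sqsubseteq \bar f.$$

Proof. If $x \in \bigcap_{i \in I} [a_i,b_i]$ for $I \subseteq \{1,\dots,n\}$ then
$f(x) \in \bigcap_{i \in I} F_i$, proving the consistency statement.

Let $\bar f \in [\mathcal{R} \to E]$ represent $f$. The function space
representation is upwards closed so it suffices to show that $\bar f$ and
$\bigsqcup_{i=1}^{n} ([a_i,b_i]; F_i)$ are consistent. For the latter it suffices
to consider consistency with each compact approximation of $\bar f$. So suppose
$\bigsqcup_{j=1}^{m} ([c_j,d_j]; G_j) \sqsubseteq \bar f$. Let
$I \subseteq \{1,\dots,n\}$ and $J \subseteq \{1,\dots,m\}$, and suppose

$$\Big(\bigcap_{i \in I} [a_i,b_i]\Big) \cap \Big(\bigcap_{j \in J} [c_j,d_j]\Big) \ne \emptyset$$

with $x$ as a witness. Then $f(x) \in \bigcap_{i \in I} F_i$ by hypothesis. Consider
the maximal ideal $D$ representing $x$. Thus $\bar f(D)$ represents $f(x)$. But for
each $j \in J$, $x \in [c_j,d_j]$ and $([c_j,d_j]; G_j) \sqsubseteq \bar f$ so
$G_j \in \bar f(D)$. But then $f(x) = \mu(\bar f(D))$, that is, $f(x) \in G_j$. We
have shown that
$(\bigcap_{i \in I} F_i) \cap (\bigcap_{j \in J} G_j) \ne \emptyset$. □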

Definition 6.6. A space $X$ is arcwise connected if for every pair of distinct
points $x_1, x_2 \in X$ there is a continuous embedding $h\colon [0,1] \to X$ such
that $h(0) = x_1$ and $h(1) = x_2$.

Lemma 6.7. Let $X$ be a metric space with a standard representation
$(E, E^R, \mu)$. Suppose $X$ is arcwise connected and each $F \in E_c$ is arcwise
connected. Then the function space representation
$([\mathcal{R} \to E], [\mathcal{R} \to E]^R, \nu)$ is dense.

Note that the representation $\mathcal{R}$ of $\mathbb{R}$ satisfies the
hypotheses. Similarly there is a standard representation of $\mathbb{R}^n$
satisfying the hypotheses.

Proof. It suffices to find $f \in \bigcap_{i=1}^{n} (([a_i,b_i]; F_i))$ for each
$\bigsqcup_{i=1}^{n} ([a_i,b_i]; F_i) \in [\mathcal{R} \to E]_c$, by Lemma 6.5. Fix
such a compact element. Partition $\{1,\dots,n\}$ into $I_1,\dots,I_k$ such that
$\bigcup_{i \in I_j} [a_i,b_i]$ is connected but

$$\Big(\bigcup_{i \in I_j} [a_i,b_i]\Big) \cap \Big(\bigcup_{i \in I_l} [a_i,b_i]\Big) = \emptyset$$

for $j \ne l$. Then $\bigcup_{i \in I_j} [a_i,b_i] = [\hat a_j, \hat b_j]$ and we may
assume that $\hat b_j < \hat a_{j+1}$ for $j = 1,\dots,k-1$.

Assume we have constructed $f_j \in \bigcap_{i \in I_j} (([a_i,b_i]; F_i))$ for
$j = 1,\dots,k$. Since the space $X$ is arcwise connected there are continuous
functions $g_j$ for $j = 1,\dots,k-1$ such that $g_j(\hat b_j) = f_j(\hat b_j)$ and
$g_j(\hat a_{j+1}) = f_{j+1}(\hat a_{j+1})$. These functions can now be glued
together to define

$$f(x) = \begin{cases} f_1(\hat a_1), & \text{if } x \le \hat a_1; \\ f_j(x), & \text{if } \hat a_j \le x \le \hat b_j,\ j = 1,\dots,k; \\ g_j(x), & \text{if } \hat b_j \le x \le \hat a_{j+1},\ j = 1,\dots,k-1; \\ f_k(\hat b_k), & \text{if } \hat b_k \le x. \end{cases}$$

Clearly $f$ is continuous and $f \in \bigcap_{i=1}^{n} (([a_i,b_i]; F_i))$.

Thus it suffices to consider the case when $\bigcup_{i=1}^{n} [a_i,b_i]$ is
connected. Let $c_1 < c_2 < \dots < c_k$ be a strictly increasing listing of the set
$\{a_1,\dots,a_n\} \cup \{b_1,\dots,b_n\}$. For each $j$, choose
$d_j \in \bigcap \{F_i : c_j \in [a_i,b_i]\}$, and let
$I_j = \{i : (c_j, c_{j+1}) \subseteq [a_i,b_i]\}$. Then $I_j \ne \emptyset$ and
$d_j, d_{j+1} \in \bigcap_{i \in I_j} F_i$. By assumption
$\bigcap_{i \in I_j} F_i$ is arcwise connected, since $E_c$ is closed under
intersection, so there is a continuous function $g_j$ such that $g_j(c_j) = d_j$,
$g_j(c_{j+1}) = d_{j+1}$, and
$g_j([c_j,c_{j+1}]) \subseteq \bigcap_{i \in I_j} F_i$. Define
$f\colon \mathbb{R} \to X$ by

$$f(x) = \begin{cases} d_1, & \text{if } x \le c_1; \\ g_j(x), & \text{if } c_j \le x \le c_{j+1},\ j = 1,\dots,k-1; \\ d_k, & \text{if } c_k \le x. \end{cases}$$

Then $f$ is continuous and $f \in \bigcap_{i=1}^{n} (([a_i,b_i]; F_i))$. □
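
For real-valued data the construction in this proof can be carried out directly, because intervals are convex and the connecting arcs can be taken to be straight line segments. The following sketch is an illustration under those assumptions only: constraints are pairs of closed rational intervals, the value chosen at each breakpoint is the midpoint of the admissible interval, and the name `witness` is hypothetical.

```python
from fractions import Fraction as Q


def witness(constraints):
    """Build a continuous piecewise-linear f meeting all interval constraints.

    constraints: list of ((a, b), (lo, hi)), each meaning f([a, b]) must lie
    in [lo, hi]. Raises ValueError if the constraints are inconsistent.
    """
    # Breakpoints: all endpoints of the time intervals, in increasing order.
    cuts = sorted({e for (a, b), _ in constraints for e in (a, b)})
    values = []
    for c in cuts:
        # Data intervals of every constraint that is active at time c.
        active = [F for (a, b), F in constraints if a <= c <= b]
        lo = max(F[0] for F in active)
        hi = min(F[1] for F in active)
        if lo > hi:
            raise ValueError("inconsistent constraints")
        values.append((lo + hi) / 2)

    def f(x):
        if x <= cuts[0]:
            return values[0]
        if x >= cuts[-1]:
            return values[-1]
        for j in range(len(cuts) - 1):
            c0, c1 = cuts[j], cuts[j + 1]
            if c0 <= x <= c1:
                t = (x - c0) / (c1 - c0)
                # Linear interpolation stays inside any convex set
                # containing both endpoint values.
                return values[j] + t * (values[j + 1] - values[j])

    return f


f = witness([((Q(0), Q(1)), (Q(1), Q(3))),
             ((Q(1, 2), Q(2)), (Q(2), Q(5)))])
```

Because the breakpoint values are computed from rational endpoints, the same procedure can be read as an informal indication of why density is effective in this special case.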

It is worth noting that for $X = \mathbb{R}$ and $E = \mathcal{R}$, the
representation

$$([\mathcal{R} \to \mathcal{R}], [\mathcal{R} \to \mathcal{R}]^R, \nu)$$

of $C(\mathbb{R} \to \mathbb{R})$ is effectively dense.

Theorem 6.8. Let $R$ be continuous time, let $T$ be discrete or continuous time,
and let $A$ and $B$ be metric spaces with standard domain representations $D_A$ and
$D_B$. Assume $B$ is arcwise connected and each $F \in (D_B)_c$ is arcwise connected.
Then a stream transformer

$$F\colon C(R \to B) \to C(T \to A)$$

is continuous if and only if $F$ has a continuous lifting
$\bar F\colon [D_R \to D_B] \to [D_T \to D_A]$.

Proof. By Proposition 4.4, Lemma 6.7 and Theorem 6.2. □

In particular we have

Corollary 6.9. A functional
$F\colon C(\mathbb{R} \to \mathbb{R}) \to C(\mathbb{R} \to \mathbb{R})$ is continuous
with respect to the compact-open topology if and only if $F$ has a continuous
lifting $\bar F\colon [\mathcal{R} \to \mathcal{R}] \to [\mathcal{R} \to \mathcal{R}]$.

D. Normann [35] has recently extended our results about density and lifting
to the whole finite type structure over $\mathcal{R}$.

To summarise, we have shown that a continuous stream transformer

$$F\colon C(R \to B) \to C(T \to A)$$

has a continuous lifting

$$\bar F\colon [D_R \to D_B] \to [D_T \to D_A]$$

if $R$ is discrete time or if $R$ is continuous time and $B = \mathbb{R}$ or
$\mathbb{R}^n$.

In the remaining case when $R$ is continuous time and $B$ is a discrete space
then $C(R \to B)$ is homeomorphic to $B$.



6.2 Simple transformations of non-continuous streams

The characterisation theorems in Section 6.1 provide domain representation
methods for many examples of transformations acting on continuous streams
$C(R \to B)$. However, transformations of signals with discrete data, such as
signals from $\mathbb{R}_+$ to $\mathbb{N}$, which are necessarily discontinuous
when non-constant, are not covered though they are important in system modelling.

In Section 5 we saw how to provide domain representations of discontinuous
streams (Theorem 5.2) and, in particular, the class of non-zeno signals. Now we
will look at their transformations: we will show how to provide domain
representations for stream transformations that are single or multiple access on
non-zeno signals, see Section 2.2.

First we need to be precise about what we mean by a stream transformer on
non-zeno signals having a domain representation.

Below we let $\mathcal{R}_+$ denote the standard closed interval domain
representation of $\mathbb{R}_+$ and we let $A$ and $B$ denote discrete spaces with,
for simplicity, flat domain representations. The stream space of non-zeno signals
over $A$ is denoted by

$$NZ(\mathbb{R}_+ \to A).$$

For non-zeno signals it is reasonable to modify the notion of approximate
representation as follows.

Definition 6.10. The function $\bar\varphi \in [\mathcal{R}_+ \to A_\bot]$ is a
non-zeno representation of $\varphi \in NZ(\mathbb{R}_+ \to A)$ if

(i) for each $t > 0$, the set $\{t' \in [0,t] : \bar\varphi(I_{t'}) = \bot\}$ is
finite and does not contain 0, and

(ii) $\bar\varphi$ represents $\varphi$ exactly for all $t$ such that
$\bar\varphi(I_t) \ne \bot$.

The set $\{t \in \mathbb{R}_+ : \bar\varphi(I_t) = \bot\}$ is called the exceptional
set for the representation $\bar\varphi$ of $\varphi$.

Definition 6.11. Let $F\colon NZ(\mathbb{R}_+ \to B) \to NZ(\mathbb{R}_+ \to A)$ be a
stream transformer on non-zeno signals. Then

$$\bar F\colon [\mathcal{R}_+ \to B_\bot] \to [\mathcal{R}_+ \to A_\bot]$$

is an approximate non-zeno representation or lifting of $F$ if $\bar F$ is
continuous and whenever $\bar\varphi \in [\mathcal{R}_+ \to B_\bot]$ is a non-zeno
representation of $\varphi \in NZ(\mathbb{R}_+ \to B)$ then $\bar F(\bar\varphi)$ is
a non-zeno representation of $F(\varphi)$.

Recall from Section 2.7 that single or multiple access signal operators with
strict retimings take non-zeno signals to non-zeno signals.

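Theorem 6.12. Let $F\colon NZ(\mathbb{R}_+ \to B) \to NZ(\mathbb{R}_+ \to A)$ be a
single access stream transformer with respect to $\pi\colon B \to A$ and a strict
retiming $\tau\colon \mathbb{R}_+ \to \mathbb{R}_+$, where $A$ and $B$ are discrete
spaces. Then $F$ has an approximate non-zeno representation

$$\bar F\colon [\mathcal{R}_+ \to B_\bot] \to [\mathcal{R}_+ \to A_\bot].$$

Proof. Define $\bar F\colon [\mathcal{R}_+ \to B_\bot]_c \to [\mathcal{R}_+ \to A_\bot]$ by

$$\bar F\Big(\bigsqcup_{i=1}^{k} ([a_i,b_i]; x_i)\Big) = \bigsqcup \Big\{ ([c,d]; \pi(x)) : (\exists i)\big(x = x_i \ \&\ [\,0 < a_i \ \&\ [c,d] \subseteq (\tau^{-1}(a_i), \tau^{-1}(b_i)) \text{ or } a_i = 0 < b_i \ \&\ [c,d] \subseteq [0, \tau^{-1}(b_i))\,]\big) \Big\}.$$

It is routine to verify that $\bar F$ is well-defined and monotone and hence
$\bar F$ extends continuously to

$$\bar F\colon [\mathcal{R}_+ \to B_\bot] \to [\mathcal{R}_+ \to A_\bot].$$

Let $\bar\varphi \in [\mathcal{R}_+ \to B_\bot]$ be a non-zeno representation of
$\varphi \in NZ(\mathbb{R}_+ \to B)$. Let $t_1 < t_2 < \cdots$ be the exceptional set
for the representation $\bar\varphi$ of $\varphi$. We claim that
$\bar F(\bar\varphi)$ is a non-zeno representation of $F(\varphi)$ with exceptional
set $\tau^{-1}(t_1) < \tau^{-1}(t_2) < \cdots$.

Suppose $\varphi(0) = x$ and choose $b$ such that $0 < b < t_1$. Then
$([0,b]; x) \sqsubseteq \bar\varphi$ so there is $d$ such that

$$([0,d]; \pi(x)) \sqsubseteq \bar F(([0,b]; x)) \sqsubseteq \bar F(\bar\varphi).$$

Thus

$$\pi(x) = ([0,d]; \pi(x))(I_0) \sqsubseteq \bar F(\bar\varphi)(I_0),$$

that is, $\bar F(\bar\varphi)(I_0) = F(\varphi)(0)$.

Similarly, $\bar F(\bar\varphi)(I_t) = F(\varphi)(t)$ when $t \ne \tau^{-1}(t_i)$
for any $i$.

Now suppose $t = \tau^{-1}(t_i)$ for some $i$. Then $\bar\varphi(I_{t_i}) = \bot$. It
then follows from the definition of $\bar F$ that $\bar F(\bar\varphi)(I_t) = \bot$. □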
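
The way the exceptional set moves under a retiming can be observed in a simplified model. The sketch below is not the construction of the proof: it assumes the retiming is strictly increasing (here doubling), represents an approximation of a signal as a function from closed rational intervals to a value or `None` (for bottom), and uses the hypothetical names `flat_lift` and `lift_transformer`.

```python
from fractions import Fraction as Q


def flat_lift(switch_times, values):
    """Flat-domain approximation of a step signal (assumed model).

    Returns the signal's value on [a, b] if no switch time lies in [a, b],
    and None (bottom) otherwise.
    """
    def phi_bar(a, b):
        if any(a <= s <= b for s in switch_times):
            return None
        k = sum(1 for s in switch_times if s < a)
        return values[k]
    return phi_bar


def lift_transformer(pi, tau):
    """Lift of F(phi)(t) = pi(phi(tau(t))) for a strictly increasing tau."""
    def F_bar(phi_bar):
        def out(c, d):
            v = phi_bar(tau(c), tau(d))
            return None if v is None else pi(v)
        return out
    return F_bar


phi_bar = flat_lift([Q(1)], ["x", "y"])                  # one switch at time 1
F_bar = lift_transformer(str.upper, lambda t: 2 * t)     # pi = upper, tau(t) = 2t
psi_bar = F_bar(phi_bar)

before = psi_bar(Q(0), Q(1, 4))      # image under tau avoids the switch
across = psi_bar(Q(1, 4), Q(3, 4))   # image under tau contains time 1
after = psi_bar(Q(3, 4), Q(1))       # image under tau is past the switch
```

In this example the only exceptional time of the input is 1, and the output is undefined exactly on intervals containing 1/2, the preimage of 1 under the retiming.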

Suppose $(A, \alpha)$ and $(B, \beta)$ are computable structures. Then
$(A_\bot, \alpha)$ and $(B_\bot, \beta)$ are effective domains with numberings
obtained from $\alpha$ and $\beta$ in a canonical way. It follows that
$[\mathcal{R}_+ \to A_\bot]$ and $[\mathcal{R}_+ \to B_\bot]$ are effective domains
with numberings obtained from $\alpha$, $\beta$ and a standard numbering of
$\mathcal{R}_+$. Analysing the proof of Theorem 6.12, in particular the definition
of $\bar F$, we obtain the following theorem (with numberings suppressed).

Theorem 6.13. Let $A$ and $B$ be computable structures and let
$F\colon NZ(\mathbb{R}_+ \to B) \to NZ(\mathbb{R}_+ \to A)$ be a single access stream
transformer with respect to $\pi\colon B \to A$ and a strict retiming
$\tau\colon \mathbb{R}_+ \to \mathbb{R}_+$. Assume that $\pi$ is computable and
$\tau$ is effective. Then $F$ has an effective approximate non-zeno representation

$$\bar F\colon [\mathcal{R}_+ \to B_\bot] \to [\mathcal{R}_+ \to A_\bot].$$

Remark 6.14. Theorems 6.12 and 6.13 easily extend to multiple access stream
transformers on non-zeno signals with essentially the same proof.

7 Concluding remarks 

We have given an introduction to streams and stream transformations with an
emphasis on transformations of both discrete and continuous time streams and
their connections. We have posed the problem of creating a unified semantic
framework for analysing the computability of the 16 different kinds of stream
transformations

$$F\colon (R \to B) \to (T \to A)$$

that depend upon whether time $T$ and $R$, and data $A$ and $B$, are discrete or
continuous.

In this paper we have given a solution to the problem. It is based on the
theory of algebraic domain representations for topological spaces and algebras.







Specifically, we have demonstrated that the domain methods can be successfully 
applied to all cases of transformations of continuous streams. 

In addition, we have explored the problem of representing discontinuous
streams, such as commonly arise in models based on signals $T \to A$, where time
$T$ is continuous and data $A$ is discrete. The computability of transformations
of discontinuous streams is an interesting subject about which little is known
at present.

Other general approaches to computability in topological spaces may pro- 
vide alternate solutions to the problem of creating a general theory of stream 
computability. For example, effective metric space theory [34] or Weihrauch’s 
TTE [57] may be applied, although these approaches are known to be equivalent
to algebraic domain representations [48].
Abstract. We provide some mathematical properties of behaviours of
systems, where the individual elements of a behaviour are modeled by
ideals, ie. downward closed directed subsets of a suitable partial order.
It is well-known that the associated ideal completion provides a simple
way of constructing algebraic cpos. An ideal can be viewed as a set of
consistent finite or compact approximations of an object which itself 
may be infinite. A special case is the domain of streams where the finite 
approximations are the finite prefixes of a stream. 

We introduce a special way of characterising behaviours through sets 
of relevant approximations. This is a generalisation of the technique we 
have used earlier for the case of streams. Given a subset $P \subseteq M$ of a partial
order $(M, \le)$, we define

\[ \mathrm{ide}\,P := \{ Q^{\le} : Q \subseteq P \text{ directed} \} , \]

where $Q^{\le} := \{ x \in M \mid \exists y \in Q : x \le y \}$ is the downward closure of Q. So
$\mathrm{ide}\,P$ is the set of all ideals “spanned” by directed subsets of P. We prove
a number of distributivity and monotonicity laws for ide and related 
operators. They are the basis for correct refinement of specifications into 
implementations. Various small examples illustrate that the operators
lead to very concise yet clear specifications.

Finally, we give a characterisation of safety and liveness and generalise 
the Alpern/Schneider decomposition lemma to arbitrary domains. 

An extended example concerns the specification and transformational 
development of an asynchronous bounded queue. 



Part I: Introduction 

1 Origin and Goals 

The context of this work is deductive program design, in which implementations 
are derived from specifications by semantics-preserving deduction rules. Exam- 
ples of this paradigm are transformational program development (eg. [53, 8]) and 
the refinement calculus (eg. [16,23,5,2,46,47]). There is a growing conviction
that this paradigm is most efficient when based on algebraic rather than purely 
logical frameworks. The aim there is to make program specification and calcu- 
lation more concise and perspicuous by compacting logic into algebra as much 
as possible. For sequential programs this is demonstrated eg. in [39,41,8]. 

In the parallel case, to some extent the work reported in [50, 14] can be viewed 
as falling into the algebraic realm; purely algebraic approaches are presented in 
[31,56]. The present paper presents a particular approach to streams (see eg. 
[30,49,12] and [62] for a recent survey). It centres around the order-theoretic 
view of streams and other semantic objects as used in denotational semantics. 
In addition to order theory we use a suitable algebra of formal languages [39] in 
reasoning about streams. 

To exhibit a certain uniformity we use the same calculational style of rea- 
soning in the parts treating the mathematical background as in the program 
derivations proper. 



2 Streams and Ideals 

The basic tool in our approach is the prefix order on finite words in A* over 
some alphabet A of basic actions, data or states. These words are considered as 
initial parts of system traces. A trace language is directed w.r.t. this order iff it 
is totally ordered by it. Therefore ideals, ie. prefix closed directed sets of traces, 
are a suitable representation of finite and infinite streams. 

It is well known that the space of streams under the prefix ordering is isomor- 
phic to the ideal completion of the set of finite streams. Since, however, ideals 
are just particular trace languages, we can use all operations on formal languages 
for their manipulation. A large extent of this is covered by conventional regular 
algebra. Moreover, we can apply the tools developed for quite different purposes 
in a number of papers on algebraic calculation of graph, pointer and sorting 
algorithms (see [39,41] and the references there). Finally, we do not need ad- 
ditional mechanisms for dealing with fairness; rather, fairness is made explicit 
within the generating expressions for trace languages. 

Using regular expressions rather than automata or transition systems gives 
considerable gain in conciseness and clarity, both in specification and calcu- 
lation. While this has long been known in the field of syntax analysis, most 
approaches to the specification of concurrency stay with the fairly detailed level 
of automata, thus leading to cumbersome and imperspicuous expressions. Other
approaches use logical formulas for describing sets of traces; these, too, can be- 
come very involved. By extracting a few important concepts and coming up 
with closed expressions for them one can express things in a more structured 
and concise form. This is done here using regular and regular-like expressions 
with their strong algebraic properties. The approach can also be nicely tied in 
with temporal and modal operators (see [45]). 

Another advantage of our approach is that we can make do with simple set-theoretic
notions, thus avoiding most of the overhead of domain theory. As a result, the
approach is also completely orthogonal w.r.t. nesting of data structures, ie. it
admits streams of functions, streams of sets, sets of streams, streams of streams
etc. without problems.

3 A Simple Soda Machine 

To show the style of our approach and in order to better motivate the techni- 
calities to come we first give a number of examples with informal explanations. 
The precise definitions will be given in later sections. 

We start with the description of a simple soda machine. It accepts half dollars 
and quarters and emits a can of soda after having received a half dollar’s worth 
in coins. Let h and q denote the events of receiving a half dollar and a quarter, 
respectively, and c the event of emitting a can of soda. Then the behaviour of 
that machine is described by the regular-like expression 

\[ ((h \cup q \cdot q) \cdot c)^{\omega} , \]

where $\cdot$ is concatenation and $^{\omega}$ denotes infinite repetition. Each expression of
this kind denotes a set of (finite or infinite) streams; in the case of the soda 
machine all these streams are infinite. 
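
As a rough illustration (not part of the formal development), the finite prefixes of the streams denoted by this expression can be recognised by a small state machine. The sketch below assumes a hypothetical encoding of the events as the characters h, q and c; the state numbering and the function name are illustrative only.

```python
# Hypothetical three-state recogniser for finite prefixes of ((h ∪ q·q)·c)^ω.
# State 0: nothing inserted; state 1: one quarter inserted;
# state 2: half a dollar inserted, can not yet delivered.
TRANSITIONS = {
    (0, "h"): 2,
    (0, "q"): 1,
    (1, "q"): 2,
    (2, "c"): 0,
}


def is_soda_prefix(word: str) -> bool:
    """Return True iff `word` is a finite prefix of some stream of the soda machine."""
    state = 0
    for event in word:
        nxt = TRANSITIONS.get((state, event))
        if nxt is None:
            # The event is not allowed here, e.g. extra money before delivery.
            return False
        state = nxt
    # Every state has an outgoing transition, so any accepted word can be extended.
    return True
```

The automaton is used here only to test membership; the text's point that the regular-like expression is the more concise description is unaffected.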

In the above expression, the iterated subexpression $(h \cup q \cdot q) \cdot c$ states
the following safety properties: the customer must insert the correct amount of
money and is not allowed to insert further money before delivery of the can.
The infinite repetition combines safety and liveness aspects: it expresses the
correct order of insert/deliver cycles, a safety property, and expresses the
temporal aspect of eventuality (see eg. [22]): it guarantees that after insertion of
a sufficient amount of money eventually a can is delivered and the machine is
ready to accept further orders.

We prefer to leave states implicit as long as possible, since frequently regular 
expressions are clearer and more concise than the corresponding descriptions by 
accepting automata (Büchi automata in the case of infinite repetition, see eg.
[52,65,66]). 

4 Fairness 

Other eventuality properties can already be expressed by Kleene’s finite repe- 
tition operators _* and _+. To exemplify this, we describe a scheduler for un- 
boundedly fair merging of input from two channels. It is modeled as an infinite 
stream over the alphabet {0, 1}, where 0 denotes choice from the left and 1 choice 
from the right input channel of the merge module. A sequence in which there 
is at least once a choice from the left followed eventually by a choice from the 
right is described by the regular expression $0^{+} \cdot 1$. By adding the symmetric
requirement and, again, infinite repetition to drive the single cycles, we get the 
following description of the set of streams that model the behaviour of a fair 
scheduler: 

\[ \mathrm{SCHED} \stackrel{\mathrm{def}}{=} (0^{+} \cdot 1 \cup 1^{+} \cdot 0)^{\omega} . \]

The “local eventuality” is here expressed by the finiteness of $^{+}$, whereas the
infinite repetition again adds liveness and “global eventuality”.

Arbitrary (and hence possibly non-fair) merge would be obtained by replacing
this scheduler by $(0 \cup 1)^{\omega}$.
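
To make the role of the scheduler concrete, here is a minimal sketch of a merge module driven by a schedule stream. The function names are hypothetical. The fairness test covers only ultimately periodic schedules of the form prefix·loop^ω; it relies on the observation that a stream can be factored into blocks $0^{+} \cdot 1$ and $1^{+} \cdot 0$ exactly when every run of equal symbols is finite, ie. when both symbols occur infinitely often.

```python
from itertools import count, cycle, islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def merge(schedule: Iterable[int], left: Iterator[T], right: Iterator[T]) -> Iterator[T]:
    """Merge two infinite inputs: 0 takes the next left item, 1 the next right item."""
    for choice in schedule:
        yield next(left) if choice == 0 else next(right)


def is_fair_lasso(prefix: str, loop: str) -> bool:
    """Fairness of the ultimately periodic schedule prefix·loop^ω.

    The finite prefix is irrelevant: the schedule is fair iff the
    repeated part contains both a 0 and a 1.
    """
    return "0" in loop and "1" in loop


def demo() -> List[int]:
    """First six outputs when the fair schedule (0·0·1)^ω drives the merge."""
    schedule = cycle([0, 0, 1])
    return list(islice(merge(schedule, count(0), count(100)), 6))
```

With the schedule $(0 \cdot 0 \cdot 1)^{\omega}$ the left input is served twice as often as the right one, yet neither input is ever starved.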

The reason why fairness does not cause problems in our approach is that 
fairness constraints are expressed using the star operation which has a simple 
recursive definition using least fixpoints w.r.t. the inclusion ordering on sets of 
streams, whereas there are continuity problems w.r.t. extensions of the prefix 
order to sets of streams. This is due to the fact that the prefix order has op- 
erational traits and unbounded fairness is operationally not feasible, whereas 
the inclusion ordering is purely descriptive and hence does not face this prob- 
lem. It is adequate for proving properties of sets of streams; when it comes to 
implementation, of course operationally feasible descendants have to be used. 

We prefer to state fairness assumptions explicitly, since this gives much 
greater flexibility than building them into the underlying semantic framework 
(such as eg. in [17]). 

5 Channels 

Another aspect of fairness and eventuality is exhibited in the description of 
channels as used in many protocol specifications. The channels are faulty, but fair 
in the sense that after an unbounded but finite number of faulty transmissions 
they will at least once transmit correctly. 

We will describe their behaviour using streams of functions. Each such func- 
tion models the transmission behaviour at one particular instance of time. The 
identity function id models correct transmission, fail transforms any message 
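into an “error element” and skip transforms any message into empty output. In
the sequel, let the sub/superscript $i$ stand for $*$ (unbounded but finite repetition)
or $\le k$ for some $k \in \mathbb{N}$ (bounded repetition). Then the following specifications
express unbounded and bounded fairness, respectively.

A possibly corrupting but fair channel is described by

\[ \mathrm{CCHAN}_i \stackrel{\mathrm{def}}{=} (\mathit{fail}^{\,i} \cdot \mathit{id})^{\omega} , \]

a possibly lossy but fair channel by

\[ \mathrm{LCHAN}_i \stackrel{\mathrm{def}}{=} (\mathit{skip}^{\,i} \cdot \mathit{id})^{\omega} \]

and a possibly lossy and corrupting but fair channel by

\[ \mathrm{LCCHAN}_i \stackrel{\mathrm{def}}{=} ((\mathit{skip} \cup \mathit{fail})^{i} \cdot \mathit{id})^{\omega} . \]

An unfair corrupting channel is

\[ \mathrm{ARBCHAN} \stackrel{\mathrm{def}}{=} (\mathit{fail} \cup \mathit{id})^{\omega} . \]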

This kind of channel description has been used in [42] for a very concise
algebraic correctness proof of the alternating bit protocol. 
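
The idea of a channel as a stream of functions can be sketched as follows. This is an illustration under assumptions: the text does not fix how a function stream is applied to a message stream, so the sketch simply applies the i-th function to the i-th message; the name ident (chosen to avoid Python's built-in id), the ERROR marker and both helper functions are hypothetical.

```python
from typing import Callable, List, Sequence

ERROR = "<error>"  # hypothetical stand-in for the "error element"

Transmission = Callable[[str], List[str]]


def ident(message: str) -> List[str]:
    """Correct transmission (the identity function id of the text)."""
    return [message]


def fail(message: str) -> List[str]:
    """Corrupting transmission: any message becomes the error element."""
    return [ERROR]


def skip(message: str) -> List[str]:
    """Lossy transmission: any message becomes empty output."""
    return []


def transmit(channel: Sequence[Transmission], messages: Sequence[str]) -> List[str]:
    """Apply a finite prefix of a channel stream pointwise to a message sequence."""
    output: List[str] = []
    for step, message in zip(channel, messages):
        output.extend(step(message))
    return output


def longest_fault_run(channel: Sequence[Transmission]) -> int:
    """Length of the longest run of consecutive faulty steps in a finite prefix."""
    longest = current = 0
    for step in channel:
        current = 0 if step is ident else current + 1
        longest = max(longest, current)
    return longest
```

A finite prefix of a boundedly fair channel with bound k never has a fault run longer than k, which is what longest_fault_run lets one inspect.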
6 Two Stream-Based Models of Systems

6.1 Modules as Stream-Processing Functions 

A stream-processing function (SPF) is a function from tuples of input streams 
to tuples of output streams (see eg. [30, 12]). In the case of synchronous systems 
this may equivalently be replaced by a function from a stream of input tuples 
to a stream of output tuples. 

In the SPF view each module is described as an SPF. The advantage of this 
model is that it allows easy definitions of various composition operations for 
modules and hence lends itself to a modular structuring of large systems. 

The disadvantage in the description of asynchronous systems is that the 
separation between input and output streams loses causal information, viz. which 
input triggered which output. This gives rise to the (in)famous merge anomaly 
[9] which can be fixed by re-introducing time information into the streams. It 
has to be expressed which elements of a stream are considered to belong to the 
same time interval. This can be done using explicit time ticks [11, 63] or streams 
of sequences where each sequence lists the elements that belong to one time 
interval [32]. 

6.2 Trace Models 

In the trace view, the overall system is described by admissible sequentialisations 
of actions during system runs (interleaving semantics). If the structure of non- 
deterministic branching is preserved, one obtains tree-like semantic domains such 
as used in CCS [36], CSP [26] and process algebra [3]. In the simplest case, 
however, the trace structure is a set of streams (see also [29]) which we term 
a behaviour. In this view, a stream in $A^{\infty}$ is a complete record of one possible
system run with all system actions interleaved.

Eg. for a CSP-like view one uses the alphabet $A = C \times V$ of basic actions,
where C is a set of channel names and V a set of values that are transmitted
along the channels. Then the streams in $A^{\infty}$ are complete records of system runs
with all channel activities interleaved. 

The advantage of this view is that it keeps track of the causality between 
input and output; hence the merge anomaly does not arise. 

The disadvantage is a loss in modularity, since only the overall system is 
described directly. Modularisation can be re-introduced, though, by restricting 
attention to subsets of channels. 



Part II: The Algebra of Ideals 

This part reviews a few order-theoretic notions and provides some auxiliary 
facts. It then goes on to develop the ideal-theoretic basis of our approach. On 
first reading, this part may be skipped and consulted later for details as the need 
arises. 
7 Mathematical Background

7.1 Order-Theoretic Preliminaries 

In this section we repeat some basic notions from the theory of partial orders. 
Some useful properties of the operations introduced are given in the Appendix. 
Proofs not contained in the present paper can be found in [42]. 

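For a partially ordered set $(M, \le)$ and $N \subseteq M$ we define the proper and
improper downward closure by

\[ N^{<} \stackrel{\mathrm{def}}{=} \{ y \in M : \exists x \in N : y < x \} \]
\[ N^{\le} \stackrel{\mathrm{def}}{=} \{ y \in M : \exists x \in N : y \le x \} = N \cup N^{<} \]

where $y < x \stackrel{\mathrm{def}}{\Leftrightarrow} y \le x \wedge x \ne y$.

The set of maximal elements of $N \subseteq M$ is defined by

\[ \max N \stackrel{\mathrm{def}}{=} N \setminus N^{<} . \]

We now extend the order $\le$ to a relation on subsets of M by

\[ N \le P \stackrel{\mathrm{def}}{\Leftrightarrow} N \subseteq P^{\le} . \]

This is the angelic half of the Egli-Milner pre-order [54]. In particular, $N^{\le} \le N$.

Since $\le$ generally is only a pre-order between sets, we are interested in the
induced equivalence relation

\[ N \sim P \stackrel{\mathrm{def}}{\Leftrightarrow} N \le P \wedge P \le N . \]

A subset $N \subseteq M$ is a cone if it is downward closed, ie. if $N^{\le} \subseteq N$. Hence
on cones $\le$ and $\subseteq$ coincide; in particular, $\le$ is a partial order on cones.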

Since M is a cone and the intersection of cones is a cone again, the set of all 
cones forms a complete lattice under inclusion. It is isomorphic to the angelic 
or Hoare power domain [58] over (M, <). However, we are not going to use that 
domain. 
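
The notions of this subsection can be tried out on a small finite order. The sketch below assumes, purely for illustration, the set {1, ..., 12} ordered by divisibility; all function names are hypothetical.

```python
from typing import Iterable, Set

M: Set[int] = set(range(1, 13))  # assumed example carrier


def leq(x: int, y: int) -> bool:
    """Assumed example order: x <= y iff x divides y."""
    return y % x == 0


def down(N: Iterable[int]) -> Set[int]:
    """Improper downward closure of N."""
    N = set(N)
    return {y for y in M if any(leq(y, x) for x in N)}


def down_strict(N: Iterable[int]) -> Set[int]:
    """Proper downward closure of N."""
    N = set(N)
    return {y for y in M if any(leq(y, x) and y != x for x in N)}


def maximal(N: Iterable[int]) -> Set[int]:
    """Maximal elements: N minus its proper downward closure."""
    N = set(N)
    return N - down_strict(N)


def set_leq(N: Iterable[int], P: Iterable[int]) -> bool:
    """Lifted order on subsets: N <= P iff N is contained in the closure of P."""
    return set(N) <= down(P)


def is_cone(N: Iterable[int]) -> bool:
    """A cone is a downward closed subset."""
    N = set(N)
    return down(N) <= N
```

For instance, the closure of {4, 6} is {1, 2, 3, 4, 6}, which is a cone, whereas {4, 6} itself is not.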



7.2 Pointwise Extension 

In the sequel we will define many functions on single points of M and lift them 
to subsets of M by pointwise extension, ie. by setting, for $f : M \to M$ and
$N \subseteq M$,

\[ f(N) \stackrel{\mathrm{def}}{=} \{ f(x) : x \in N \} . \]

These pointwise extended functions distribute through arbitrary unions and
hence are monotonic w.r.t. inclusion and strict w.r.t. $\emptyset$. We will also use this
mechanism to lift these functions a further level to sets of subsets of M. 
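
A minimal sketch of pointwise extension, assuming Python sets as the representation of subsets. The name lift is hypothetical; frozenset results are used so that a lifted function can itself be lifted a further level.

```python
from typing import Callable, FrozenSet, Iterable, TypeVar

S = TypeVar("S")
T = TypeVar("T")


def lift(f: Callable[[S], T]) -> Callable[[Iterable[S]], FrozenSet[T]]:
    """Pointwise extension of f from single points to sets of points."""

    def lifted(N: Iterable[S]) -> FrozenSet[T]:
        return frozenset(f(x) for x in N)

    return lifted


# Example: the successor function on integers, lifted once and twice.
succ = lift(lambda x: x + 1)
succ2 = lift(succ)
```

Distribution through union and strictness w.r.t. the empty set then hold by construction, as the comprehension ranges over the elements one at a time.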

Pointwise extensions inherit linear laws. These are laws of the following form: 
- Equational laws in which all variables occur exactly once on both sides of
the equality sign. Examples are the laws of neutrality, associativity and
commutativity over a groupoid.

- Implications with element relations as atoms in which all variables occur
exactly once on both sides of the implication sign. In the inherited form the
variables for elements turn into variables for non-empty sets and the element
relations turn into inclusions. An example is

\[ s \cdot t = \varepsilon \;\Rightarrow\; s = \varepsilon \wedge t = \varepsilon \]

which lifts to

\[ S \ne \emptyset \wedge T \ne \emptyset \wedge S \cdot T \subseteq \varepsilon \;\Rightarrow\; S \subseteq \varepsilon \wedge T \subseteq \varepsilon . \]

7.3 Directed Sets and the Ideal Completion 

A subset $N \subseteq M$ is directed if every finite subset of N has an upper bound
in N. Equivalently, N is directed if $N \ne \emptyset$ and any two elements of N have
a common upper bound in N. Hence every two elements of a directed set are
consistent in that they approximate a common element.

For $P \subseteq M$ we denote by $\mathrm{dir}\,P$ the set of all directed subsets of P. Note
that the operation dir is monotonic w.r.t. inclusion. Some further properties of 
dir can be found in the Appendix. 
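
For finite sets the operator dir can be computed by brute force. The following sketch assumes the divisibility order on positive integers as an example; the function names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def leq(x: int, y: int) -> bool:
    """Assumed example order: x <= y iff x divides y."""
    return y % x == 0


def is_directed(N: Iterable[int]) -> bool:
    """Non-empty, and any two elements have a common upper bound inside N."""
    N = set(N)
    return bool(N) and all(
        any(leq(x, z) and leq(y, z) for z in N) for x in N for y in N
    )


def dir_of(P: Iterable[int]) -> Set[FrozenSet[int]]:
    """All directed subsets of the finite set P."""
    P = sorted(P)
    return {
        frozenset(c)
        for r in range(1, len(P) + 1)
        for c in combinations(P, r)
        if is_directed(c)
    }
```

The set {2, 3} is not directed, since neither element bounds the other, but it becomes directed once the common upper bound 6 is added.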

To tie our approach in with domain-theoretic notions (see eg. [64]) we recall
the ideal completion (cf. eg. [6, 19]). Consider an ordered set $(M, \le)$. An ideal is
a directed cone. The set of all ideals is denoted by $\mathcal{I}(M)$.

The partial order $(M, \le)$ is called complete or a cpo iff every directed set
$D \subseteq M$ has a supremum (or least upper bound) $\bigsqcup D \in M$. An element x of M
is finite (compact) iff for every directed set $D \subseteq M$ with $x \le \bigsqcup D$ we have also
$x \le z$ for some $z \in D$. Equivalently, x is finite iff for every ideal $I \subseteq M$ with
$x \le \bigsqcup I$ we have $x \in I$. $(M, \le)$ is algebraic iff every element of M is the supremum
of a directed set of finite elements. A non-finite element of an algebraic set is
called a limit point or an infinite element. With these notions one has

Theorem 7.1. 1. The set $(\mathcal{I}(M), \subseteq)$ ordered by set inclusion is a cpo and
algebraic, the finite elements being the principal ideals $x^{\le}$ for $x \in M$. The
mapping $\iota : x \mapsto x^{\le}$ is an embedding of M into $\mathcal{I}(M)$.

2. For every monotonic mapping $h : M \to P$ into a cpo $(P, \le)$ there is a unique
continuous mapping $\hat{h} : \mathcal{I}(M) \to P$ extending h, ie. with $\hat{h}(x^{\le}) = h(x)$. $\hat{h}$
is given by $\hat{h}(I) = \bigsqcup h(I)$ for $I \in \mathcal{I}(M)$; hence $\hat{h}(D^{\le}) = \bigsqcup h(D)$ for directed
$D \subseteq M$.

The ordered set $(\mathcal{I}(M), \subseteq)$ is called the ideal completion of $(M, \le)$. We set
$M^{\infty} \stackrel{\mathrm{def}}{=} \mathcal{I}(M)$. An ideal in $M^{\infty}$ is non-compact iff it does not have a maximal
(and hence greatest) element.
8 Streams as Ideals

We now make our notion of streams precise. Assume an alphabet A of atomic
actions, data or states. Then, as usual, $A^{*}$ is the set of all finite words over A. By
$\varepsilon$ we denote the empty word, whereas concatenation is denoted by $\cdot$. A subset
of $A^{*}$ is called a (formal) language.

A word u is a prefix of a word v, written $u \sqsubseteq v$, iff there is a word w such that
$u \cdot w = v$. It is well-known that this defines a partial order on words which is even
well-founded. Moreover, $\varepsilon$ is the least element in this order. The corresponding
strict order is denoted by $\sqsubset$. A cone of $(A^{*}, \sqsubseteq)$ is then a prefix-closed language.
Note that every non-empty cone contains $\varepsilon$.

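A few properties we shall use are the following (where $x, y, u, v, w \in A^{*}$ and
$U, V \subseteq A^{*}$):

\[ v \sqsubseteq w \;\Leftrightarrow\; u \cdot v \sqsubseteq u \cdot w , \qquad (1) \]
\[ x \sqsubseteq u \wedge y \sqsubseteq u \;\Rightarrow\; x \sqsubseteq y \vee y \sqsubseteq x , \qquad (2) \]
\[ V \ne \emptyset \;\Rightarrow\; (u \cdot V)^{\le} = u^{\le} \cup u \cdot V^{\le} . \qquad (3) \]

Property (2) is also called local linearity.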

Informally, a stream over A is a finite or infinite sequence of elements of A. 
The basis of our approach is the observation that such a stream is completely 
characterised by the set of its finite prefixes. This set is downward closed w.r.t.
$\sqsubseteq$, ie. a cone. Moreover, it is directed, since in the partial order $(A^{*}, \sqsubseteq)$ by local
linearity the directed sets can be characterised another way:

Lemma 8.1. $D \subseteq A^{*}$ is directed w.r.t. $\sqsubseteq$ iff D is totally ordered by $\sqsubseteq$, ie. iff
for any two elements $u, v \in D$ we have $u \sqsubseteq v$ or $v \sqsubseteq u$.
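
Lemma 8.1 can be sanity-checked by brute force on a small instance. The sketch below assumes the alphabet {a, b} and words of length at most two, and ranges over non-empty subsets only, since directedness includes non-emptiness; it is a finite test, not a proof, and all names are hypothetical.

```python
from itertools import combinations, product
from typing import Sequence

ALPHABET = "ab"  # assumed example alphabet
WORDS = ["".join(p) for n in range(3) for p in product(ALPHABET, repeat=n)]


def is_prefix(u: str, v: str) -> bool:
    """Prefix order on finite words."""
    return v.startswith(u)


def is_directed(D: Sequence[str]) -> bool:
    """Non-empty, and any two words have a common extension inside D."""
    return bool(D) and all(
        any(is_prefix(u, w) and is_prefix(v, w) for w in D) for u in D for v in D
    )


def is_chain(D: Sequence[str]) -> bool:
    """Totally ordered by the prefix order."""
    return all(is_prefix(u, v) or is_prefix(v, u) for u in D for v in D)


def lemma_8_1_holds() -> bool:
    """Compare both notions on every non-empty subset of WORDS."""
    return all(
        is_directed(D) == is_chain(D)
        for r in range(1, len(WORDS) + 1)
        for D in combinations(WORDS, r)
    )
```

There are seven words and hence 127 non-empty subsets to compare, so the check is instantaneous.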

Hence an ideal of $(A^{*}, \sqsubseteq)$ is a totally ordered, prefix-closed, non-empty language.
Note that every ideal contains $\varepsilon$. Therefore an ideal is a set of words of increasing
length “growing at the right end”. This set may be finite or infinite. A simple
example is, for $a \in A$, the infinite ideal

\[ a^{*} = \{ \varepsilon, a, a \cdot a, a \cdot a \cdot a, a \cdot a \cdot a \cdot a, \ldots \} . \]

We identify a stream with the set of its finite prefixes. By the above, this set
is an ideal of $(A^{*}, \sqsubseteq)$. Therefore we call the elements of $A^{\infty}$ streams over A. It
should be noted that the compact elements of $A^{\infty}$ correspond to the elements
of $A^{*}$; hence, for countable A, the set $(A^{\infty}, \subseteq)$ has a countable basis of finite
elements and therefore is countably algebraic. The length of a stream S is denoted
by $|S|$; it coincides with its cardinality minus one. Let us give a characterisation
of infinite streams:

Lemma 8.2. A stream S is infinite iff $\max S = \emptyset$.

Proof. First, by linearity of the prefix order on a stream and by its well-founded- 
ness, an infinite stream cannot have a maximal element. The reverse implication 
is provided by Lemma 12.1.2. □ 



The compact elements of $A^{\infty}$ correspond to the elements of $A^{*}$, whereas
the non-compact elements are precisely the (cardinally) infinite ideals. They
correspond to infinite sequences over A.

To resume our previous example, the ideal

\[ a^{*} = \{ \varepsilon, a, a \cdot a, a \cdot a \cdot a, a \cdot a \cdot a \cdot a, \ldots \} \]

is the limit (supremum) of the set of finite ideals

\[ \{ \{ a^{i} : i \le n \} : n \in \mathbb{N} \} \]

corresponding to the $\sqsubseteq$-increasing set

\[ \{ a^{n} : n \in \mathbb{N} \} \]

of finite words. It may thus be viewed as a representation of the infinite stream
of $a$s. This observation is the main motivation for our approach; it allows us to
work with infinite streams by manipulating their sets of finite approximations, 
since in the ideal completion each (finite or infinite) element is identified with 
the set of its finite approximations. This allows carrying over all laws from the 
algebra of formal languages to streams. Of course, the fact that the set of finite 
and infinite streams is isomorphic to the ideal completion of the set of finite 
streams is well-known; what is new here is the direct algebraic manipulation of 
the ideals using those laws. 
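
The identification of a finite stream with its set of prefixes, and the approximation of the infinite stream of $a$s by finite ideals, can be sketched as follows. Words are represented as Python strings, which is an assumption of this illustration; the function names are hypothetical.

```python
from typing import FrozenSet, List


def principal_ideal(word: str) -> FrozenSet[str]:
    """The finite stream `word`, represented as the set of all its prefixes."""
    return frozenset(word[:i] for i in range(len(word) + 1))


def approximations(symbol: str, n: int) -> List[FrozenSet[str]]:
    """The finite ideals {symbol^i : i <= k} for k = 0, ..., n."""
    return [principal_ideal(symbol * k) for k in range(n + 1)]
```

Each ideal in the list is a proper subset of the next, and the length of a finite stream is the cardinality of its ideal minus one; the infinite ideal $a^{*}$ is the union of the whole increasing family.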

While our approach was motivated by the particular case of streams, we 
will perform the mathematical development as far as possible for general ideal 
completions. 

9 A Setting for Non-Interleaving Semantics 

To stress the latter point and to illustrate our approach with a different setting
we now sketch how partial-order semantics, allowing true concurrency, can be 
accommodated in our setting. 

Let E be a set of events. Then a history over E is a partial order $(F, \preceq)$
with a finite subset $F \subseteq E$ of events. The order $\preceq$ models temporal/causal
dependence. Two events not related by $\preceq$ are considered as parallel/concurrent.

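Let now $H(E)$ be the set of all histories over E. We define an approximation
ordering $\le$ on $H(E)$ by

\[ (F_1, \preceq_1) \le (F_2, \preceq_2) \;\stackrel{\mathrm{def}}{\Leftrightarrow}\; F_1 \subseteq F_2 \;\wedge \]
\[ \preceq_1 \;=\; \preceq_2 \cap\, (F_1 \times F_2) \;\wedge \]
\[ \forall x \in F_1 : x^{\preceq_1} = x^{\preceq_2} \;\wedge \]
\[ \forall y \in F_2 : \exists x \in F_1 : x \preceq_2 y . \]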

This is the appropriate generalisation of the prefix relation on words to histories.
It means that $F_1$ is embedded as a cone into $F_2$ and $F_2$ may only add “later”
events. It is straightforward to check that this indeed defines a partial order.
The least element is $(\emptyset, \emptyset)$.

A chronicle now is an ideal in $(H(E), \le)$, and infinite chronicles generalise
infinite streams. The case of streams is retrieved if one only considers histories
that are linearly ordered by $\preceq$; in that case $\le$ corresponds directly to $\sqsubseteq$. In the
present paper, we shall not pursue this example further, though.



10 Behaviours and Refinement 

Our application of ideals will be the description of systems. To model non- 
determinacy, we define a behaviour to be a set of ideals. 

It should be noted that using sets of ideals as behaviours allows only “trace- 
like” semantics in which there is no distinction between internal and external 
non-determinacy. The algebraic reflection of this is that concatenation, our se- 
quencing operation, distributes through union both from the left and from the 
right. In algebraic approaches to CCS-like systems (see eg. [36,3]) only one of 
these distributivities holds. This results in models with tree-like objects that 
reflect the non-deterministic branching structure in time. This detailed record is 
lost by admitting both distributivities rather than just one. 

The set of finite prefixes of a behaviour B is

\[ \mathrm{pref}\,B \stackrel{\mathrm{def}}{=} \bigcup B . \]

Clearly, pref distributes through union and hence is $\subseteq$-monotonic.

As our refinement relation we choose inclusion, ie. behaviour B refines
behaviour C if $B \subseteq C$. For instance, given a property $P \subseteq M$, the set $\mathrm{ide}\,P$ of ideals
satisfying P is a behaviour. To allow correct local refinements one therefore has
to ensure monotonicity of all operations w.r.t. inclusion.

Example 10.1. We resume the example from Section 4 and show that bounded
fairness refines unbounded fairness: since all operators involved are monotonic
w.r.t. inclusion, we obtain from $a \cdot a^{\le k} \subseteq a^{+}$ for $a \in A$ that

\[ (0 \cdot 0^{\le k} \cdot 1 \cup 1 \cdot 1^{\le k} \cdot 0)^{\omega} \subseteq \mathrm{SCHED} . \]



□ 
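
The block-level inclusion used in this example can be checked mechanically for small cases. The sketch below encodes single scheduler cycles as regular expressions for Python's re module and tests, for all binary words up to a given length, that every boundedly fair block is also an unboundedly fair block. It checks only the premise about finite blocks; the inclusion between the sets of infinite streams then follows by monotonicity as stated in the text. The function names are hypothetical.

```python
import re
from itertools import product

# One cycle of the unboundedly fair scheduler: 0+·1 ∪ 1+·0.
UNBOUNDED = re.compile(r"0+1|1+0")


def bounded_block(k: int):
    """One cycle with at most k extra repetitions: 0·0^{<=k}·1 ∪ 1·1^{<=k}·0."""
    return re.compile(rf"0{{1,{k + 1}}}1|1{{1,{k + 1}}}0")


def blocks_refine(k: int, max_len: int) -> bool:
    """Check the block inclusion on all binary words of length <= max_len."""
    bounded = bounded_block(k)
    words = ("".join(p) for n in range(max_len + 1) for p in product("01", repeat=n))
    return all(UNBOUNDED.fullmatch(w) for w in words if bounded.fullmatch(w))
```

The inclusion is proper: for k = 2 the block 00001 is unboundedly fair but exceeds the bound.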



11 Describing Behaviours by Properties 

We want to characterise ideals by certain sets of “relevant” or “admissible” finite 
approximations. Such a set, ie. a subset of our overall partially ordered set M, 
is called a property in this connection. 

In the particular case of streams the finite approximations are “snapshots” in 
the form of finite words in $A^{*}$. Assume a set $U \subseteq A^{*}$ of admissible snapshots. If
a stream contains a subset $D \subseteq U$ of snapshots then D has to be directed. However,
there may be arbitrary “gaps” between the snapshots in D. To reconstruct
the stream we therefore have to “fill in the details” between the snapshots. This
is done by taking the prefix closure $D^{\le}$. Hence we define the set of streams, ie.
the behaviour, spanned by snapshot set U as

\[ \mathrm{str}\,U \stackrel{\mathrm{def}}{=} \{ D^{\le} : D \in \mathrm{dir}\,U \} . \]

This is the set of streams that “interpolate” consistent snapshots in U. A related 
notion occurs in [20]; the connection will be made precise in Section 12. 

We generalise this to arbitrary partial orders and their ideal completions. Let
$(M, \le)$ be the partial order of finite approximations. For property $P \subseteq M$ we
now define by

\[ \mathrm{ide}\,P \stackrel{\mathrm{def}}{=} \{ D^{\le} : D \in \mathrm{dir}\,P \} \]

the set of all ideals “spanned” by directed subsets of P. Note that $\mathrm{ide}\,M = \mathcal{I}(M)$.
Moreover, ide is monotonic w.r.t. inclusion. A different characterisation of ide is
given by

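Lemma 11.1. For $I \in \mathcal{I}(M)$ and $Q \subseteq M$ the following statements are equivalent:

1. $I \in \mathrm{ide}\,Q$.

2. $I \subseteq (I \cap Q)^{\le}$.

3. $I = (I \cap Q)^{\le}$.

Proof. The equivalence of 2 and 3 is obvious by monotonicity of $^{\le}$ and downward
closedness of I.

(1 $\Rightarrow$ 2) Suppose $I = D^{\le}$ for $D \in \mathrm{dir}\,Q$.

    $I$
  $=$ {[ assumption ]}
    $D^{\le}$
  $=$ {[ since $D \subseteq Q$ ]}
    $(D \cap Q)^{\le}$
  $\subseteq$ {[ monotonicity ]}
    $(D^{\le} \cap Q)^{\le}$
  $=$ {[ assumption ]}
    $(I \cap Q)^{\le}$ .

(3 $\Rightarrow$ 1) Since I is directed, so is $(I \cap Q)^{\le}$. By Lemma 30.5.3 also $I \cap Q$ is directed
and the claim follows. □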
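
On a finite partial order the operator ide and the equivalences of Lemma 11.1 can be checked exhaustively. The sketch below assumes, as an example, the divisors of 12 ordered by divisibility; in a finite order every ideal is principal, so this is only a sanity check of the definitions, not of the general statement. All names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set

M = [1, 2, 3, 4, 6, 12]  # assumed example: divisors of 12 under divisibility


def leq(x: int, y: int) -> bool:
    return y % x == 0


def down(N: Iterable[int]) -> FrozenSet[int]:
    """Improper downward closure within M."""
    N = list(N)
    return frozenset(y for y in M if any(leq(y, x) for x in N))


def is_directed(N: Iterable[int]) -> bool:
    N = list(N)
    return bool(N) and all(
        any(leq(x, z) and leq(y, z) for z in N) for x in N for y in N
    )


def subsets(P: Iterable[int]) -> List[FrozenSet[int]]:
    P = sorted(P)
    return [frozenset(c) for r in range(len(P) + 1) for c in combinations(P, r)]


def ide(P: Iterable[int]) -> Set[FrozenSet[int]]:
    """All ideals spanned by directed subsets of P."""
    return {down(D) for D in subsets(P) if is_directed(D)}


IDEALS = ide(M)  # equals the set of all ideals of M


def lemma_11_1_holds(Q: Iterable[int]) -> bool:
    """Statements 1, 2 and 3 agree for every ideal I of M."""
    Q = frozenset(Q)
    ide_q = ide(Q)
    return all(
        (I in ide_q) == (I <= down(I & Q)) and (I in ide_q) == (I == down(I & Q))
        for I in IDEALS
    )
```

For example, the property {2, 3} spans exactly the two ideals {1, 2} and {1, 3}.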

We have the following distributivity property for ide: 

Lemma 11.2. Consider $N, P \subseteq M$. Then

\[ \mathrm{ide}\,(N \cup P) = \mathrm{ide}\,N \cup \mathrm{ide}\,P . \]

Proof.
    $I \in \mathrm{ide}\,(N \cup P)$
  $\Leftrightarrow$ {[ by Lemma 11.1 ]}
    $I = (I \cap (N \cup P))^{\le}$
  $\Leftrightarrow$ {[ distributivity of $\cap$ over $\cup$ and Lemma 30.1.1 ]}
    $I = (I \cap N)^{\le} \cup (I \cap P)^{\le}$
  $\Rightarrow$ {[ by directedness of I, Lemma 30.5.2 and Lemma 30.3.3 ]}
    $I = (I \cap N)^{\le} \vee I = (I \cap P)^{\le}$
  $\Leftrightarrow$ {[ by Lemma 11.1 ]}
    $I \in \mathrm{ide}\,N \vee I \in \mathrm{ide}\,P$ .

The reverse inclusion follows by monotonicity of ide.

Another proof can be given using Lemma 30.5.5. □
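
The distributivity law, together with the order-reflecting property stated next in Corollary 11.3, can be confirmed exhaustively on a small example. As before, the divisors of 12 under divisibility are an assumed test order and the helper names are hypothetical; the check is a finite sanity test only.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set

M = [1, 2, 3, 4, 6, 12]  # assumed example: divisors of 12 under divisibility


def leq(x: int, y: int) -> bool:
    return y % x == 0


def down(N: Iterable[int]) -> FrozenSet[int]:
    N = list(N)
    return frozenset(y for y in M if any(leq(y, x) for x in N))


def is_directed(N: Iterable[int]) -> bool:
    N = list(N)
    return bool(N) and all(
        any(leq(x, z) and leq(y, z) for z in N) for x in N for y in N
    )


def subsets(P: Iterable[int]) -> List[FrozenSet[int]]:
    P = sorted(P)
    return [frozenset(c) for r in range(len(P) + 1) for c in combinations(P, r)]


def ide(P: Iterable[int]) -> Set[FrozenSet[int]]:
    return {down(D) for D in subsets(P) if is_directed(D)}


# Precompute ide for every subset of M so that the pairwise checks are cheap.
IDE: Dict[FrozenSet[int], Set[FrozenSet[int]]] = {P: ide(P) for P in subsets(M)}


def lemma_11_2_holds() -> bool:
    """ide(N ∪ P) = ide N ∪ ide P for all subsets N, P of M."""
    return all(IDE[N | P] == IDE[N] | IDE[P] for N in IDE for P in IDE)


def corollary_11_3_holds() -> bool:
    """N ⊆ P iff ide N ⊆ ide P for all subsets N, P of M."""
    return all((N <= P) == (IDE[N] <= IDE[P]) for N in IDE for P in IDE)
```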

This also shows once again the monotonicity of ide. We have even 

Corollary 11.3. $N \subseteq P \Leftrightarrow \mathrm{ide}\,N \subseteq \mathrm{ide}\,P$.

Proof. The inclusion from right to left is part of Theorem 7.1.1 via the principal
ideals $x^{\le}$ for $x \in M$. □

It should be noted, however, that ide only distributes through finite unions 
and hence is not “continuous”. For an instance of this see Example 13.3 below. 

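Lemma 11.4. We have the following properties concerning downward closure:

1. $I \in \mathrm{ide}\,(P^{\le}) \Leftrightarrow I \subseteq P^{\le}$.

2. $\mathrm{pref}\,\mathrm{ide}\,P = P^{\le}$.

3. $\mathrm{ide}\,Q \subseteq \mathrm{ide}\,Q^{\le}$. The reverse inclusion is not valid.

Proof. 1. ($\Rightarrow$) Assume $I = D^{\le}$ for $D \in \mathrm{dir}\,(P^{\le})$. Then, by monotonicity and
idempotence of $^{\le}$ we get $D^{\le} \subseteq (P^{\le})^{\le} = P^{\le}$, ie. $I \subseteq P^{\le}$.

($\Leftarrow$) Straightforward, since $I \subseteq P^{\le}$ implies $I \in \mathrm{dir}\,(P^{\le})$ and $I = I^{\le}$.

2. The inclusion $\subseteq$ is straightforward. For the reverse consider $y \in P^{\le}$. There
is $x \in P$ with $y \le x$. But then $y \in x^{\le} \in \mathrm{ide}\,P$.

3. Immediate from $Q \subseteq Q^{\le}$ and monotonicity of ide. For a counterexample to
the reverse inclusion see Example 13.1.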

□ 
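
The three statements of Lemma 11.4 can again be tested on a finite example. The sketch assumes the divisors of 12 under divisibility and hypothetical helper names, and it also exhibits a small instance in which the inclusion of item 3 is proper. This instance is an illustration chosen here and is not claimed to be the Example 13.1 referred to in the text.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set

M = [1, 2, 3, 4, 6, 12]  # assumed example: divisors of 12 under divisibility


def leq(x: int, y: int) -> bool:
    return y % x == 0


def down(N: Iterable[int]) -> FrozenSet[int]:
    N = list(N)
    return frozenset(y for y in M if any(leq(y, x) for x in N))


def is_directed(N: Iterable[int]) -> bool:
    N = list(N)
    return bool(N) and all(
        any(leq(x, z) and leq(y, z) for z in N) for x in N for y in N
    )


def subsets(P: Iterable[int]) -> List[FrozenSet[int]]:
    P = sorted(P)
    return [frozenset(c) for r in range(len(P) + 1) for c in combinations(P, r)]


def ide(P: Iterable[int]) -> Set[FrozenSet[int]]:
    return {down(D) for D in subsets(P) if is_directed(D)}


def pref(B: Iterable[FrozenSet[int]]) -> FrozenSet[int]:
    """Finite approximations of a behaviour: the union of its ideals."""
    return frozenset().union(*B)


IDEALS = ide(M)


def lemma_11_4_holds() -> bool:
    for P in subsets(M):
        closed = down(P)
        ide_closed = ide(closed)
        # Item 1: an ideal lies in ide(P^<=) iff it is contained in P^<=.
        if any((I in ide_closed) != (I <= closed) for I in IDEALS):
            return False
        # Item 2: pref ide P = P^<=.
        if pref(ide(P)) != closed:
            return False
        # Item 3: ide P is contained in ide(P^<=).
        if not ide(P) <= ide_closed:
            return False
    return True
```

For Q = {12} the behaviour ide Q contains only the ideal of all divisors, whereas ide of the closure of Q contains all six ideals.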



12 Maximal and Infinite Ideals 

12.1 Maximal Ideals 

Frequently one is interested in processes that continue as long as possible. These 
are modeled by ideals which are maximal w.r.t. $\le$ or, equivalently, w.r.t.
inclusion. We therefore give a characterisation of maximal ideals. For a behaviour B
we denote the subset of maximal ideals by $\max B$; this agrees with the definition
in Section 7.1, and hence all our laws in the Appendix apply.
Lemma 12.1. Suppose $I \in \mathcal{I}(M)$ and $N \subseteq M$. Then

1. $x \in \max I \Leftrightarrow I = x^{\le}$.

2. $\max I = \emptyset \Rightarrow I$ infinite.

3. $\max N = \emptyset \wedge I \in \max \mathrm{ide}\,N \Rightarrow \max I = \emptyset$.

Proof. 1. ($\Rightarrow$) We only need to show $I \subseteq x^{\le}$; the other inclusion follows from
downward closure of I. Suppose $y \in I$. By directedness of I there is $z \in I$
with $x \le z$ and $y \le z$. Maximality of x implies $z = x$ and hence $y \le x$.

($\Leftarrow$)

    $\max I$
  $=$ {[ by assumption ]}
    $\max x^{\le}$
  $=$ {[ by Lemma 30.2.1 ]}
    $(x^{\le})^{\le} \setminus (x^{\le})^{<}$
  $=$ {[ by Lemma 30.1.2 ]}
    $x^{\le} \setminus x^{<}$
  $=$ {[ by Lemma 30.2.1 ]}
    $\max x$
  $=$ {[ irreflexivity of $<$ ]}
    $\{x\}$ .

2. Every non-empty finite set has a maximal element.

3. Suppose $\max I \ne \emptyset$, say $x \in \max I$. By 1 then $I = x^{\le}$ and by $I \in \mathrm{ide}\,N$ we
get $x \in N$. Since $\max N = \emptyset$, there is $y \in N$ with $x \le y$ and $y \ne x$. But then
$y^{\le} \in \mathrm{ide}\,N$ and hence, by Theorem 7.1.1, we have $x^{\le} \subseteq y^{\le} \wedge x^{\le} \ne y^{\le}$.
This is a contradiction to $I \in \max \mathrm{ide}\,N$.

□ 



12.2 Infinite Ideals 

Motivated by Lemma 12.1.2 we define, for a behaviour B, the set of its infinite
ideals as

\[ \mathrm{inf}\,B \stackrel{\mathrm{def}}{=} \{ I \in B : \max I = \emptyset \} . \]

For general domains, this is a bit of a misnomer, since there may well be infinite
ideals with maximal elements. However, we will single out a particular class
of domains where this cannot occur and work mostly with these, so that the
terminology will be justified. Clearly, inf distributes through arbitrary union
and intersection:

\[ \mathrm{inf}\,\Bigl(\bigcup_{i \in I} B_i\Bigr) = \bigcup_{i \in I} \mathrm{inf}\,B_i , \qquad (4) \]
\[ \mathrm{inf}\,\Bigl(\bigcap_{i \in I} B_i\Bigr) = \bigcap_{i \in I} \mathrm{inf}\,B_i . \qquad (5) \]
Now Lemma 12.1.3 can be restated as

\[ \max N = \emptyset \;\Rightarrow\; \max \mathrm{ide}\,N \subseteq \mathrm{inf}\,\mathrm{ide}\,N . \]

The reverse inclusion is generally not valid. For a counterexample choose
$M = \mathbb{N} \cup \{\infty\}$ with the usual ordering and consider the ideal $\mathbb{N} \in \mathcal{I}(M)$. We have
$\max \mathbb{N} = \emptyset$, but $\mathbb{N} \notin \max \mathcal{I}(M)$, since $\mathbb{N} \subseteq M \in \mathcal{I}(M)$ and $\mathbb{N} \ne M$.

We call a partial order $(M, \le)$ max-determined if

\[ \mathrm{inf}\,\mathcal{I}(M) \subseteq \max \mathcal{I}(M) . \]



12.3 Refinement Laws 

Now we clarify the relation between inf ide and maxide and investigate mono- 
tonicity and distributivity of the maxide, inf ide and max inf operations, which is 
important for refinement. First we note 

Lemma 12.2. For $N, P \subseteq M$,

\[ \mathrm{inf}\,\mathrm{ide}\,(N \cup P) = \mathrm{inf}\,\mathrm{ide}\,N \cup \mathrm{inf}\,\mathrm{ide}\,P . \]

In particular, inf ide is monotonic w.r.t. inclusion. 

Proof. Immediate from Lemma 11.2 and equation (4). □ 

Concerning maximal ideals we have 
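Lemma 12.3. Let $(M, \le)$ be max-determined. Then, for $N, P \subseteq M$,

1. $\mathrm{inf}\,\mathrm{ide}\,N \subseteq \mathrm{ide}\,N \cap \max \mathcal{I}(M) \subseteq \max \mathrm{ide}\,N$.

2. $\max \mathrm{ide}\,N = \mathrm{inf}\,\mathrm{ide}\,N \cup \mathrm{ide}\,\max N$.

3. $\max N = \emptyset \Rightarrow \mathrm{inf}\,\mathrm{ide}\,N = \mathrm{ide}\,N \cap \max \mathcal{I}(M) = \max \mathrm{ide}\,N$.

4. $\mathrm{inf}\,\mathrm{ide}\,(N \cup P) = \mathrm{inf}\,\mathrm{ide}\,N \cup \mathrm{inf}\,\mathrm{ide}\,P$.
In particular, inf ide is monotonic w.r.t. inclusion.

5. $N = \mathrm{saf}\,N \Rightarrow \mathrm{inf}\,\mathrm{ide}\,(N \cap P) = \mathrm{inf}\,\mathrm{ide}\,N \cap \mathrm{inf}\,\mathrm{ide}\,P$.

6. $\max N = \emptyset \wedge N \subseteq P \Rightarrow \max \mathrm{ide}\,N \subseteq \max \mathrm{ide}\,P$.

7. $\max N = \max P = \emptyset \Rightarrow \max \mathrm{ide}\,(N \cup P) = \max \mathrm{ide}\,N \cup \max \mathrm{ide}\,P$.

8. If N and P are cones with $\max N = \max P = \max (N \cap P) = \emptyset$ then
$\max \mathrm{ide}\,(N \cap P) = \max \mathrm{ide}\,N \cap \max \mathrm{ide}\,P$.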

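Proof. 1.
    $I \in \mathrm{inf}\,\mathrm{ide}\,N$
  $\Leftrightarrow$ {[ definition ]}
    $I \in \mathrm{ide}\,N \wedge \max I = \emptyset$
  $\Rightarrow$ {[ since $(M, \le)$ is max-determined ]}
    $I \in \mathrm{ide}\,N \wedge I \in \max \mathcal{I}(M)$
  $\Rightarrow$ {[ by Lemma 30.2.3, since $\mathrm{ide}\,N \subseteq \mathcal{I}(M)$ ]}
    $I \in \max \mathrm{ide}\,N$ .

2. ($\subseteq$) Suppose $I \in \max \mathrm{ide}\,N$. If $\max I = \emptyset$, then $I \in \mathrm{inf}\,\mathrm{ide}\,N$ by definition.
Otherwise $\max I$ is a singleton, say $\max I = \{x\}$, and $I = x^{\le}$. It follows that
$x \in N$. For $y \in N$ with $x \le y$ we have $x^{\le} \subseteq y^{\le} \in \mathrm{ide}\,N$, so that $x^{\le} = y^{\le}$
by maximality of $I = x^{\le}$. Hence also $x = y$. This shows $x \in \max N$, so that
$I = x^{\le} \in \mathrm{ide}\,\max N$.

($\supseteq$) $\mathrm{inf}\,\mathrm{ide}\,N \subseteq \max \mathrm{ide}\,N$ was shown in 1. Suppose now $I \in \mathrm{ide}\,\max N$, say
$I = x^{\le}$ with $x \in \max N$, and $I \subseteq J \in \mathrm{ide}\,N$, say $J = D^{\le}$ for $D \in \mathrm{dir}\,N$.
Consider $y \in J$. By directedness of J there is a $z \in J$ with $x, y \le z$. By
$J = D^{\le}$ there is a $u \in D$ with $z \le u$. Hence also $x, y \le u$. By $D \subseteq N$ and
$x \in \max N$ we get $x = u$. So $y \le x$ and hence $y \in x^{\le} = I$. Altogether, $J \subseteq I$
and hence $J = I$. So $I \in \max \mathrm{ide}\,N$.

3. Assume $\max N = \emptyset$. Then by Lemma 12.1.3 $\max \mathrm{ide}\,N \subseteq \mathrm{inf}\,\mathrm{ide}\,N$ and the
equalities follow from 1.

6.
    $\max \mathrm{ide}\,N$
  $=$ {[ by 3 ]}
    $\mathrm{ide}\,N \cap \max \mathcal{I}(M)$
  $\subseteq$ {[ by assumption $N \subseteq P$ and monotonicity of ide ]}
    $\mathrm{ide}\,P \cap \max \mathcal{I}(M)$
  $\subseteq$ {[ by 3 ]}
    $\max \mathrm{ide}\,P$ .

7. We aim at an application of Lemma 30.2.4. Assume $I \in \max \mathrm{ide}\,N \cap (\mathrm{ide}\,P)^{<}$.
By 3 we have $I \in \max \mathcal{I}(M)$. But by $I \in (\mathrm{ide}\,P)^{<}$ there is $J \in \mathrm{ide}\,P$ with
$I \subset J$, a contradiction to maximality of I. Hence $\max \mathrm{ide}\,N \cap (\mathrm{ide}\,P)^{<} = \emptyset$.
By symmetry, also $\max \mathrm{ide}\,P \cap (\mathrm{ide}\,N)^{<} = \emptyset$. Now the claim is immediate
from Lemma 30.2.4.

8. ($\subseteq$) follows from 6.

($\supseteq$) Assume $I \in \max \mathrm{ide}\,N \cap \max \mathrm{ide}\,P$. Then by 3 we have $I \in \max \mathcal{I}(M)$.
Hence, again by 3, we only need to show $I \in \mathrm{ide}\,(N \cap P)$. Since N and P are
cones we get $I \subseteq N$ and $I \subseteq P$ and hence $I \subseteq N \cap P$ as well, showing the
claim.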

□ 

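The next lemma allows simplification of the defining property of a behaviour.

Lemma 12.4. Consider $N, P \subseteq M$. Then

\[ \mathsf{max\,ide}\,(N \cup P) = \mathsf{max\,ide}\,P \;\Leftrightarrow\; \mathsf{ide}\,N \le \mathsf{ide}\,P . \]

Proof. ($\Leftarrow$)
    $\mathsf{ide}\,N \le \mathsf{ide}\,P$
$\Rightarrow$ {[ by Lemma 30.3.4 ]}
    $\mathsf{max}\,(\mathsf{ide}\,N \cup \mathsf{ide}\,P) = \mathsf{max\,ide}\,P$
$\Leftrightarrow$ {[ by Lemma 11.2 ]}
    $\mathsf{max\,ide}\,(N \cup P) = \mathsf{max\,ide}\,P$ .

($\Rightarrow$) If $N = \emptyset$, the claim holds trivially, since $\mathsf{ide}\,\emptyset = \emptyset$. Hence we now assume $N \ne \emptyset$.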

We now need the so-called Maximal Principle, a variant of the Axiom of Choice 
(see eg. [19]): Assume a partial order in which every non-empty chain has an 
upper bound. Then every element has a maximal element above it. 

We apply this to the partial order $(\mathsf{ide}\,(N \cup P), \subseteq)$. It satisfies the assumption, since $\mathsf{ide}\,(N \cup P)$ is closed under directed unions and hence, in particular, under unions of chains. Consider now $I \in \mathsf{ide}\,N \subseteq \mathsf{ide}\,(N \cup P)$. By the maximal principle there is a $J \in \mathsf{max\,ide}\,(N \cup P) = \mathsf{max\,ide}\,P$ with $I \le J$. □

Under additional assumptions we can simplify the assertion: 

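Lemma 12.5. Assume $P \in \mathsf{dir}\,M$. Then

\[ \mathsf{max\,ide}\,(N \cup P) = \mathsf{max\,ide}\,P \;\Leftrightarrow\; N \le P . \]

Proof. To apply Lemma 12.4 we show that $P \in \mathsf{dir}\,M$ implies

\[ \mathsf{ide}\,N \le \mathsf{ide}\,P \;\Leftrightarrow\; N \le P . \]

($\Rightarrow$) Assume $x \in N$. Then $x^{\le} \in \mathsf{ide}\,N$ and so there is $I \in \mathsf{ide}\,P$, say $I = D^{\le}$ for $D \in \mathsf{dir}\,P$, with $x^{\le} \le I$. By Lemma 30.3.1-2 we get $x \le P$.

($\Leftarrow$) For $I \in \mathsf{ide}\,N$ we have $I \le P \in \mathsf{dir}\,P$ and hence, by Lemma 30.3.1, also $I \le P^{\le} \in \mathsf{ide}\,P$. □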

For a counterexample when P is not directed see Example 14.2 in connection 
with Corollary 12.6.2 below. 

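Recalling the equivalence $\sim$ associated with the pre-order $\le$, we obtain from the previous two lemmas

Corollary 12.6. Consider $N, P \subseteq M$. Then

1. $\mathsf{ide}\,N \sim \mathsf{ide}\,P \;\Leftrightarrow\; \mathsf{max\,ide}\,N = \mathsf{max\,ide}\,P$.

2. If $N, P \in \mathsf{dir}\,M$ then

\[ N \sim P \;\Leftrightarrow\; \mathsf{max\,ide}\,N = \mathsf{max\,ide}\,P . \]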



12.4 An Alternative Characterisation of Infinite Ideals 

We conclude this section by an alternative characterisation of the set $\mathsf{inf\,ide}\,P$ for a property $P \subseteq M$. First we define

\[ \mathsf{lim}\,P \;\stackrel{\mathrm{def}}{=}\; \{ I \in I(M) : I \cap P \in \mathsf{dir}\,M \wedge \mathsf{max}\,(I \cap P) = \emptyset \} . \]

This generalises the corresponding definition for infinite words or streams in [55, 60,61,65,66] (to cite just a few), which is based on [20]. Other notations for
limP found in the literature are P^ or P. We can then show

Lemma 12.7. 1. $\mathsf{inf\,ide}\,P \subseteq \mathsf{lim}\,P$.

2. If $(M, \le)$ is max-determined then the reverse inclusion holds as well.

Proof. We first note that

    $I \in \mathsf{inf\,ide}\,P$
$\Leftrightarrow$ {[ definition ]}
    $I \in \mathsf{ide}\,P \wedge \mathsf{max}\,I = \emptyset$
$\Leftrightarrow$ {[ by Lemma 11.1 ]}
    $I = (I \cap P)^{\le} \wedge \mathsf{max}\,I = \emptyset$
$\Leftrightarrow$ {[ equality ]}
    $I = (I \cap P)^{\le} \wedge \mathsf{max}\,(I \cap P)^{\le} = \emptyset$
$\Leftrightarrow$ {[ Lemma 30.2.2 ]}
    $I = (I \cap P)^{\le} \wedge \mathsf{max}\,(I \cap P) = \emptyset$ .    (*)

Now we prove our claims as follows:

1.
    (*)
$\Rightarrow$ {[ by Lemma 30.5.3 ]}
    $I \cap P \in \mathsf{dir}\,M \wedge \mathsf{max}\,(I \cap P) = \emptyset$
$\Leftrightarrow$ {[ definition ]}
    $I \in \mathsf{lim}\,P$ .

2. Let $(M, \le)$ be max-determined and assume $I \in \mathsf{lim}\,P$. By (*) it remains to show $I = (I \cap P)^{\le}$. First, by monotonicity of downward closure we have $(I \cap P)^{\le} \subseteq I^{\le} = I$. Using Lemma 30.2.2 we obtain $\mathsf{max}\,(I \cap P)^{\le} = \mathsf{max}\,(I \cap P) = \emptyset$, so that by max-determinedness $(I \cap P)^{\le} \in \mathsf{max}\,I(M)$ and hence $(I \cap P)^{\le} = I$.

□ 



12.5 About max-Determinedness 

To investigate under which conditions a partial order is max-determined, we introduce some auxiliary notions. Let $F : \mathcal{P}(M) \to \mathcal{P}(\mathcal{P}(M))$ be some function, such as $\mathsf{dir}$ or $\mathsf{ide}$. We say that $N \subseteq M$ has $F$-maxima if every set in $F(N)$ has a maximal element. In addition to the functions mentioned we shall use

\[ \mathsf{ne}\,N \;\stackrel{\mathrm{def}}{=}\; \{ P \subseteq N : P \ne \emptyset \} , \]
\[ \mathsf{chai}\,N \;\stackrel{\mathrm{def}}{=}\; \{ P \subseteq N : P \text{ non-empty chain} \} . \]

Lemma 12.8. If $N \subseteq M$ has $\mathsf{chai}$-maxima, then it also has $\mathsf{ne}$-maxima.

Proof. Assume $\emptyset \ne D \subseteq N$ and $\mathsf{max}\,D = \emptyset$. Construct a chain $C \subseteq N$ as follows: Choose $x_0 \in D$ arbitrarily. Assume now that $x_i$ has been found. Since $x_i \notin \mathsf{max}\,D$, there is $x_{i+1} \in D$ with $x_i < x_{i+1}$. Now for $C \stackrel{\mathrm{def}}{=} \{ x_i : i \in \mathbb{N} \}$ we have $\mathsf{max}\,C = \emptyset$, a contradiction. □



Corollary 12.9. If $N \subseteq M$ has $\mathsf{chai}$-maxima, then it also has $\mathsf{dir}$-maxima.

Proof. Every directed set is non-empty. □

We say that $(M, \le)$ separates ideals if for all $I, J \in I(M)$ with $I \ne J$ the intersection $I \cap J$ has $\mathsf{chai}$-maxima. The connection with max-determinedness is given by

Theorem 12.10. $(M, \le)$ is max-determined iff $(M, \le)$ separates ideals.

Proof. ($\Rightarrow$) Suppose $I \ne J$ and $C \in \mathsf{chai}\,(I \cap J)$, but $\mathsf{max}\,C = \emptyset$. Then $C^{\le}$ is an ideal with $\mathsf{max}\,C^{\le} = \emptyset$. By max-determinedness then $C^{\le} \in \mathsf{max\,ide}\,M$. Since by downward closedness of $I$ and $J$ we have $C^{\le} \subseteq I$ and $C^{\le} \subseteq J$ it follows that $I = C^{\le} = J$, a contradiction.

($\Leftarrow$) Assume $\mathsf{max}\,I = \emptyset$ and $I \notin \mathsf{max\,ide}\,M$. Then there is $J \ne I$ with $I \subseteq J$. Since $(M, \le)$ separates ideals, and by Corollary 12.9, then $I = I \cap J$ has $\mathsf{dir}$-maxima. In particular, $\mathsf{max}\,I \ne \emptyset$, a contradiction. □



This has the following surprising consequence: 



Corollary 12.11. Let $(M, \le)$ be max-determined. Then all elements of $M$ are compact.

Proof. By the previous theorem, $(M, \le)$ separates ideals.

We now first show $\bigsqcup I \in I$ for all $I \in I(M)$. Assume $y = \bigsqcup I$ and set $J \stackrel{\mathrm{def}}{=} y^{\le}$. We have $I \subseteq J$. If $I \ne J$ then $I = I \cap J$ has a maximal and hence, by directedness, greatest element $z$. But then $z = \bigsqcup I = y$ so that $J = I$, a contradiction.

Consider now $x \in M$ and $I \in I(M)$ such that $x \le \bigsqcup I \in I$. By downward closedness of $I$ we get $x \in I$ and $x$ is compact. □



The reverse implication is not valid as the following example shows. Consider the partial order

[Figure: Hasse diagram on the natural numbers, with the even numbers 0, 2, 4, ... in one column and the odd numbers 1, 3, 5, ... in the other]

in which all elements are compact. However, for $I \stackrel{\mathrm{def}}{=} \{0, 2, 4, \ldots\}$ we have $\mathsf{max}\,I = \emptyset$ and $I \subset J \stackrel{\mathrm{def}}{=} \{0, 1, 2, 3, 4, \ldots\}$, ie. $I$ is not maximal. Concerning separation of ideals, $I = I \cap J$ doesn't have a maximal element.

It will be interesting to find further, more “manageable” characterisations of 
max-determinedness. 



Part III: A Particular Case: Streams 

We now specialize to a particular partial order. We shall represent streams 
using sets of finite traces. These are finite words over an alphabet A of atomic 
actions; they are ordered by the prefix relation. 

13 Streams and Properties 

Whenever we are working in the particular domain $A^\infty$, we rename $\mathsf{ide}$ into $\mathsf{str}$ to emphasise that fact. So the set of streams spanned by property $P \subseteq A^*$ is

\[ \mathsf{str}\,P \;\stackrel{\mathrm{def}}{=}\; \mathsf{ide}\,P . \]

Note that it would not be adequate to work with the set $\mathsf{str}\,(P^{\le})$, the so-called adherence of $P$ (see eg. [49,60]), instead of $\mathsf{str}\,P$. The reason is that by prefix-closure infinite substreams may “sneak” into a cone although it results from a language of mutually $\sqsubseteq$-incomparable words which represent systems with finite behaviour only.

Example 13.1. The language $L \stackrel{\mathrm{def}}{=} 0^* \cdot 1$ represents a behaviour with arbitrarily long but finite sequences of 0s terminated by the “explicit endmarker” 1. The words in $L$ are mutually incomparable w.r.t. $\sqsubseteq$. Hence all directed subsets of $L$ are singletons and their downward closures are principal ideals and hence finite. So $\mathsf{str}\,L$ consists of finite ideals only. However, the prefix closure $L^{\le}$ contains the infinite ideal $0^*$ representing the infinite stream $0^\omega$ of 0s. So $\mathsf{str}\,L^{\le} = \mathsf{str}\,L \cup \{0^\omega\}$. □
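The effect described in Example 13.1 can be observed on a finite approximation. The following is a minimal sketch, not part of the original development: words are modelled as Python strings, the language $0^* \cdot 1$ is truncated at a hypothetical bound `N`, and the helper names are illustrative assumptions.

```python
def prefix_closure(words):
    """Return the set of all prefixes of the given words (downward closure)."""
    return {w[:i] for w in words for i in range(len(w) + 1)}


def pairwise_incomparable(words):
    """True if no word is a proper prefix of another word in the set."""
    return not any(u != v and v.startswith(u) for u in words for v in words)


# Finite approximation of L = 0* . 1 (assumption: at most N zeros per word).
N = 6
L = {"0" * k + "1" for k in range(N + 1)}

# The chain of all-zero words; it appears only through prefix closure.
zero_chain = {"0" * k for k in range(N + 1)}

closure = prefix_closure(L)
```

In the truncated setting the words of `L` are pairwise incomparable, whereas `closure` contains the growing chain `zero_chain`; in the untruncated language this chain is the infinite ideal $0^*$.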

Using König’s Lemma one can show that for finite A every infinite cone
contains an infinite stream. The general definition of ide omits these undesired 
streams. 

So, using ide we can distinguish between erratic and angelic non-determinacy 
and talk about fairness without resorting to metric and topological spaces as eg. 
in [4]. 

Example 13.2. Consider the recursive definition

\[ \mathcal{B} = 0 \circ \mathcal{B} \;[\!]\; 1 , \]

where $\circ$ denotes stream concatenation (see Section 15 for a precise definition) and $[\!]$ denotes non-deterministic choice. In an angelic interpretation of $[\!]$ always eventually the terminating branch 1 is chosen, and so $\mathcal{B}$ would equal $\mathsf{str}\,L$ of Example 13.1.

In an erratic interpretation of $[\!]$, on the other hand, no guarantee is given that the terminating branch will ever be chosen, and so $\mathcal{B}$ would equal $\mathsf{str}\,L^{\le}$ of Example 13.1. □

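We want to show now that $\mathsf{str}$ (and hence $\mathsf{ide}$) does not distribute through general union:

Example 13.3. Take $U = 0^*$. Then $U = \bigcup_{i \in \mathbb{N}} 0^i$. However, $\mathsf{str}\,U = \{0^*\} \cup \{ (0^i)^{\le} : i \in \mathbb{N} \}$, whereas $\bigcup_{i \in \mathbb{N}} \mathsf{str}\,0^i = \{ (0^i)^{\le} : i \in \mathbb{N} \}$. □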

14 Maximal and Infinite Streams 

As already mentioned, maximal ideals model processes that go on as long as possible. For streams we have a more pleasant situation than for general ideals:

Lemma 14.1. $(A^*, \sqsubseteq)$ is max-determined.

Proof. Assume $I \in \mathsf{ide}\,A^* \wedge \mathsf{max}\,I = \emptyset$ and consider $J \in \mathsf{ide}\,A^*$ with $I \subseteq J$. By Lemma 30.4 and downward closure of $I, J$ it suffices to show $J \le I$. Consider $y \in J$. Since $\mathsf{max}\,I = \emptyset$ there is some $x \in I \subseteq J$ with $\|y\| < \|x\|$, where $\|u\|$ denotes the length of word $u$. Moreover, by directedness of $J$, there is $z \in J$ with $x \sqsubseteq z \wedge y \sqsubseteq z$. From linearity of $z^{\le}$ it therefore follows that $x \sqsubseteq y \vee y \sqsubseteq x$. However, since $\|y\| < \|x\|$, we must have $y \sqsubseteq x$. □

This allows us to use all laws from Section 12 for streams. At this point it is also convenient to give the counterexample to the simplified version of Corollary 12.6:

Example 14.2. Set $U \stackrel{\mathrm{def}}{=} 0^* \cdot 1$ and $V \stackrel{\mathrm{def}}{=} U \cup 0^* = U^{\le}$ by (3). Then $U \sim V$, but $\mathsf{max\,str}\,U \ne \mathsf{max\,str}\,V$, since $0^* \in (\mathsf{max\,str}\,V) \setminus (\mathsf{max\,str}\,U)$. □

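Concerning infinite streams, we note that by Lemma 8.2 we have

\[ \mathsf{inf\,ide}\,P = \{ I \in \mathsf{ide}\,P : I \text{ infinite} \} . \]

To establish the relation with [60] we also show

Lemma 14.3. For $P \subseteq A^*$ we have

\[ \mathsf{lim}\,P = \{ I \in A^\infty : I \cap P \text{ infinite} \} . \]

Proof.
    $I \in \mathsf{lim}\,P$
$\Leftrightarrow$ {[ definition ]}
    $I \cap P \in \mathsf{dir}\,M \wedge \mathsf{max}\,(I \cap P) = \emptyset$
$\Leftrightarrow$ {[ by $I \cap P \subseteq I$ and Lemma 8.1 ]}
    $I \cap P \ne \emptyset \wedge \mathsf{max}\,(I \cap P) = \emptyset$ .

We show now that, for linearly ordered $L \subseteq A^*$,

\[ L \text{ infinite} \;\Leftrightarrow\; L \ne \emptyset \wedge \mathsf{max}\,L = \emptyset . \]

($\Rightarrow$) $L \ne \emptyset$ is immediate. Suppose $x \in \mathsf{max}\,L$. By linearity then $L \subseteq x^{\le}$. But then $|L| \le \|x\| + 1$, a contradiction.

($\Leftarrow$) Every non-empty finite set has a maximal element. □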

To end this section, we write out specialisations of some of our laws for the 
case of streams, since they will be used in the bounded buffer example below: 

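Corollary 14.4.

\[ \mathsf{inf\,str}\,(N \cup P) = \mathsf{inf\,str}\,P \;\Leftarrow\; N \le P \wedge P \text{ directed} \]
\[ \mathsf{inf\,str}\,N = \mathsf{inf\,str}\,P \;\Leftarrow\; N \sim P \wedge N, P \text{ directed} \]

Proof. Immediate from Lemma 14.1, Lemma 12.5 and Corollary 12.6. □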



Example 14.5. Since $((a \cdot b)^* \cdot a) \sim (a \cdot b)^*$ and both languages are directed, we obtain $\mathsf{inf\,str}\,((a \cdot b)^* \cdot a) = \mathsf{inf\,str}\,(a \cdot b)^*$. □

15 Stream Concatenation

As a prerequisite for defining infinite repetition we need stream concatenation which, for streams $S, T$ is defined by

\[ S \circ T \;\stackrel{\mathrm{def}}{=}\; S \cup (\mathsf{max}\,S) \cdot T . \]

Let us explain this definition. If $S$ is finite then $\mathsf{max}\,S$ is a singleton. This part of the overall behaviour then is prefixed to all traces in $T$ to represent the concatenated behaviour. If $S$ is infinite then $\mathsf{max}\,S = \emptyset$ and hence, by strictness of $\circ$, we get $S \circ T = S$, as is intuitively expected. We have

\[ \mathsf{max}\,(S \circ T) = (\mathsf{max}\,S) \cdot (\mathsf{max}\,T) . \]

It is straightforward to show that $S \circ T$ is indeed a stream and that $(A^\infty, \circ, \varepsilon)$ is a monoid. As a shorthand notation we shall also allow words as first argument of $\circ$. This is made precise by setting

\[ u \circ T \;\stackrel{\mathrm{def}}{=}\; u^{\le} \circ T = u^{\le} \cup u \cdot T . \]

Again, $\circ$ is extended pointwise to behaviours and, in the case of the above shorthand, to languages.
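For finite streams the definition of $\circ$ can be tried out directly. The following is a minimal sketch under stated assumptions: a finite stream is represented by the prefix-closed set of a single word (a Python set of strings), and the names `down`, `max_elems` and `concat` are illustrative. Infinite streams are not representable here, since a set without maximal elements cannot be finite.

```python
def down(u):
    """Principal ideal u^<=: the set of all prefixes of the word u."""
    return {u[:i] for i in range(len(u) + 1)}


def max_elems(S):
    """Maximal elements of S with respect to the prefix order."""
    return {u for u in S if not any(v != u and v.startswith(u) for v in S)}


def concat(S, T):
    """Stream concatenation S o T = S u (max S) . T on sets of words."""
    return set(S) | {m + t for m in max_elems(S) for t in T}


S = down("ab")
T = down("cd")
ST = concat(S, T)   # expected to coincide with down("abcd")
UNIT = {""}         # the stream consisting of the empty word only
```

On these examples `concat` reproduces the law $\mathsf{max}\,(S \circ T) = (\mathsf{max}\,S) \cdot (\mathsf{max}\,T)$ and `UNIT` behaves as a neutral element.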

16 Infinite Repetition 

We now give the usual greatest fixpoint definition of the set $U^\omega$ of streams that result from infinite repetition of words from a language $U \subseteq A^*$:

\[ U^\omega = U \circ U^\omega \;\wedge\; ( X = U \circ X \Rightarrow X \subseteq U^\omega ) . \]

According to the Knaster-Tarski fixpoint theorem this is well-defined by monotonicity of $\circ$. Note that by this definition $\emptyset^\omega = \emptyset$. However, if $\varepsilon \in U$ then $U^\omega = A^\infty$. For this reason, $U^\omega$ is usually considered only for $\varepsilon \notin U$.

It should be noted that for $|U| \ge 2$ and $\varepsilon \notin U$ there are nontrivial solutions of $X = U \circ X$ properly less than $U^\omega$. As an example consider the behaviour $U^* \circ \bigcup_{u \in U} u^\omega$ of eventually periodic streams.

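To tie this in with the $\mathsf{str}$-operation, we quote [60], p. 433:

\[ \varepsilon \notin U \;\Rightarrow\; \mathsf{lim}\,U^* = U^\omega \cup U^* \circ \mathsf{lim}\,U , \]

or, using Lemma 12.7 and max-determinedness,

\[ \varepsilon \notin U \;\Rightarrow\; \mathsf{inf\,str}\,U^* = U^\omega \cup U^* \circ \mathsf{inf\,str}\,U . \]

From this, by strictness of $\circ$ it is immediate that

\[ \varepsilon \notin U \wedge \mathsf{inf\,str}\,U = \emptyset \;\Rightarrow\; U^\omega = \mathsf{inf\,str}\,U^* . \qquad (6) \]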



A sufficient condition to establish the premise is given by

Lemma 16.1. If $U \subseteq A^* \setminus \varepsilon$ satisfies the Fano condition, ie. the words in $U$ are mutually incomparable w.r.t. $\sqsubseteq$, then

\[ U^\omega = \mathsf{inf\,str}\,U^* . \]

Proof. By the Fano condition, all directed subsets of $U$ are singletons. Hence $\mathsf{str}\,U = \{ u^{\le} : u \in U \}$ consists of finite streams only. □

Note that if $\varepsilon \in U$ then $U$ satisfies the Fano condition iff $U = \varepsilon$; for this case the above equation doesn’t hold, since then $\mathsf{inf\,str}\,U^* = \emptyset$. It should also be mentioned that $U$ satisfies the Fano condition iff $U = \mathsf{max}\,U$. To see what happens if the Fano condition is not satisfied, consider

Example 16.2. Let $A = \{a, b\}$ and $U \stackrel{\mathrm{def}}{=} \{ a \cdot b^n : n \in \mathbb{N} \} \subseteq A^*$. Then $U \in \mathsf{dir}\,U^*$, since $U \subseteq U^*$ and $U$ is directed. Hence $U^{\le} = \varepsilon \cup U \in \mathsf{str}\,U^*$ and, since $U^{\le}$ is infinite, even $U^{\le} \in \mathsf{inf\,str}\,U^*$. Now, $U^{\le}$ represents an $a$ followed by infinitely many $b$s; but this behaviour clearly does not arise from repeated concatenation of words in $U$. It is “sneaked in” by the fact that simply considering directed subsets of $U^*$ throws away too much structural information. □
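The Fano condition and its characterisation $U = \mathsf{max}\,U$ are easy to test on finite languages. The following is a minimal sketch; the function names and the truncation of the language of Example 16.2 to five words are assumptions made for illustration.

```python
def max_elems(U):
    """Maximal elements of U with respect to the prefix order."""
    return {u for u in U if not any(v != u and v.startswith(u) for v in U)}


def satisfies_fano(U):
    """True if the words of U are mutually incomparable w.r.t. the prefix order."""
    return not any(u != v and v.startswith(u) for u in U for v in U)


# A prefix-free language over {0, 1}.
prefix_free = {"01", "10", "11"}

# Truncation of U = {a . b^n : n in N} from Example 16.2 (assumption: n < 5).
truncated_example = {"a" + "b" * n for n in range(5)}
```

For `prefix_free` both tests agree that the condition holds, while `truncated_example` is a chain, so only its longest word is maximal.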

To allow a characterisation of $U^\omega$ for languages that do not satisfy the Fano condition, one can artificially enforce it by attaching a special endmarker to all words in $U$ and remove it after singling out the infinite streams. Let $\# \notin A$ be a new letter and consider streams over the extended alphabet $A \cup \#$. Moreover, denote by $A \triangleleft u$ the word that results from $u$ by removing all occurrences of $\#$ and extend the operation $A \triangleleft$ pointwise to languages and behaviours. Then we have

Lemma 16.3. For $U \subseteq A^* \setminus \varepsilon$,

\[ U^\omega = A \triangleleft \mathsf{inf\,str}\,(U \cdot \#)^* . \]

For the somewhat tedious proof see [43].

The streams in $\mathsf{str}\,(U \cdot \#)^*$ correspond to finite and infinite sequences that result from concatenating arbitrary elements of $U$ with the separator $\#$ in between. The operation $\mathsf{inf}$ then selects the infinite ones of these; if $\varepsilon \notin U$ these are precisely the infinite words resulting from repeatedly concatenating words from $U$. The separators are used to record the “construction history” of the streams; they are finally thrown away again by the filter $A \triangleleft$. In this way subsets of $U^*$ which are directed “by accident” are ignored. A similar mechanism for defining iteration is employed in [50] in the finite case and in [11] in the infinite case.

17 Streams of Functions 

We have made no assumptions about our alphabet A. Hence it may even be a
set of functions. Then streams over A model components with time-dependent
behaviour. We have seen examples of this in the description of various faulty
channels in Section 5.

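A stream $S \in A^\infty$ of arguments is fed into a stream $F \in (A \to A^*)^\infty$ of functions by the operator $\gg$. The images $f(a)$ of the elements $a$ of $A$ under the individual functions $f$ in $F$ are concatenated into an overall output stream. A wordwise definition of this is

\[ \varepsilon \gg w \;\stackrel{\mathrm{def}}{=}\; \varepsilon , \]
\[ s \gg \varepsilon \;\stackrel{\mathrm{def}}{=}\; \varepsilon , \]
\[ a \cdot s \gg f \cdot w \;\stackrel{\mathrm{def}}{=}\; f(a) \cdot (s \gg w) . \]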

This operation is extended pointwise to languages and behaviours. 
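On finite words the three defining equations amount to a pairwise traversal that stops as soon as either argument is exhausted. The following is a minimal sketch, assuming that words are Python sequences, that each function returns a list of output symbols, and that the names `feed`, `ident`, `drop` and `dup` are purely illustrative.

```python
def feed(s, w):
    """Wordwise s >> w: apply the i-th function of w to the i-th symbol of s.

    The result is the concatenation of the images; zip() stops at the end of
    the shorter argument, which matches the two base cases of the definition.
    """
    out = []
    for a, f in zip(s, w):
        out.extend(f(a))
    return out


def ident(a):
    return [a]      # transmit the symbol unchanged


def drop(a):
    return []       # lose the symbol


def dup(a):
    return [a, a]   # duplicate the symbol


result = feed("abc", [ident, drop, dup])
```

Here `result` is the list `['a', 'c', 'c']`: the first symbol is passed on, the second is lost and the third is duplicated.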

Example 17.1. For finite stream S we have 

(S • T) » cchan^ = (5 » arbchan) • (T » cchanA) . 

This reflects the unbounded fairness of cchan^'. we have no guarantee when cor- 
rect transmission occurs, and hence the elements of S may or may not be trans- 
mitted correctly. □ 

With bound assumptions one gets more precise information: 

Example 17.2. We have 

m> k (a™ • T) » cchan<k G A-* • a • A°° . 

A channel with fairness bound k must transmit a correctly at least once if it 
receives more than k copies of a. □ 

18 Feedback and State-Based Systems 

18.1 The Feedback Operation 

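An essential operation on SPFs is feedback of some outputs to the inputs. Assume an SPF $F : A^\infty \times B^\infty \to A^\infty \times C^\infty$. Then its feedback $\mathit{feed}\,F : B^\infty \to C^\infty$ is given by

\[ (\mathit{feed}\,F)(S) \;\stackrel{\mathrm{def}}{=}\; T \quad \text{where } (Z, T) = F(Z, S) . \]

It may be depicted as

[Figure: block diagram of feed F: a box F with external input S, external output T, and the stream Z fed back from output to input]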

The semantics of this recursive declaration is the usual least-fixpoint one. This version of the feedback operator hides the feedback stream. If this is to be made visible one simply copies it and feeds one copy back whereas the other is transmitted to the outside.

18.2 State-Based Systems and Automata

This operation together with streams of functions allows a very convenient and concise description of state-based systems.

Assume a set $Q$ of states, an input alphabet $A$ and an output alphabet $B$. Then a time-dependent automaton is given by a stream $H \in (Q \times A \to Q \times B)^\infty$.

We may now feed this automaton with a starting state $q_0 \in Q$ and a stream $S \in A^\infty$ of input values to produce a stream of output values in $B^\infty$. The stream of states entered during the processing of the input is constructed by a feedback and hidden from the outside. This is described by

\[ \mathit{auto}(H, q_0, S) \;\stackrel{\mathrm{def}}{=}\; T \quad \text{where } (Z, T) = (q_0 \cdot Z, S) \gg H . \]

By placing various restrictions on the entities involved, we can distinguish a 
hierarchy of automata: 

- If no further restrictions are made, we obtain a timed and state-dependent automaton.

- If we require $|Q| = 1$ then we have a timed and state-independent automaton.

- If we take $H = f^\omega$ for some $f : Q \times A \to Q \times B$, we obtain a timeless and state-dependent automaton.

- If we again take $H = f^\omega$ but also require $|Q| = 1$, we have a timeless and state-independent automaton.

For example, an easy proof by induction over the structure of the finite words shows

Lemma 18.1. If $|Q| = 1$ then

\[ \mathit{auto}(f^\omega, q_0, S) = S \gg g^\omega , \]

where $g(a) = \pi_2(f(q_0, a))$.

We now illustrate the general case by the following 

Example 18.2. We give a description of a one-place asynchronous buffer. The example is taken from [10]. Consider a set $D$ of data. The input alphabet is

\[ A \;\stackrel{\mathrm{def}}{=}\; D \cup \{!\} . \]

An input $d \in D$ means that $d$ is to be stored in the buffer, whereas $!$ means a request for the current contents of the buffer.

At each time point the buffer may accept or reject its input which is shown by a Boolean value. In addition to that the buffer will output data if it accepts the request signal. So we choose the output alphabet

\[ B \;\stackrel{\mathrm{def}}{=}\; (D \cup \{\varepsilon\}) \times \mathbb{B} , \]

where $\varepsilon$ models the case of no proper output.

As the set of states we choose

\[ Q \;\stackrel{\mathrm{def}}{=}\; D \cup \{\varepsilon\} \]

where $\varepsilon$ models the state of being empty whereas $d \in D$ models the state of containing value $d$.

Now we define two transition functions

\[ \mathit{acc}, \mathit{rej} : Q \times A \to Q \times B \]

which model acceptance and rejection of the input. We have

\[ \mathit{acc}(q, d) = (\textbf{if } q = \varepsilon \textbf{ then } d \textbf{ else } q,\; (\varepsilon, q = \varepsilon)) , \]
\[ \mathit{acc}(q, !) = (\varepsilon,\; (q, q \ne \varepsilon)) , \]
\[ \mathit{rej}(q, x) = (q,\; (\varepsilon, \mathit{false})) . \]

The behaviour of a fair buffer, ie. one which rejects inputs only finitely many times before eventually accepting one, is then specified as

\[ \mathit{auto}((\mathit{rej}^* \cdot \mathit{acc})^\omega, \varepsilon) . \]

In particular, we can avoid the use of prophecy variables (see eg. [10]) in this 
style. □ 
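The buffer of Example 18.2 can be run on finite prefixes of its input. The following is a minimal sketch under stated assumptions: the empty state and the empty output $\varepsilon$ are both modelled by `None`, the request signal by the string `"!"`, data values are assumed to differ from both, and `auto` is restricted to finite sequences of transition functions and inputs.

```python
EMPTY = None     # models epsilon, both as state and as "no proper output"
REQUEST = "!"    # the request signal; data values must differ from it


def acc(q, x):
    """Accepting transition: returns (next state, (output, accepted flag))."""
    if x == REQUEST:
        # Hand out the contents; the flag reports whether there was any.
        return EMPTY, (q, q is not EMPTY)
    # Store x only if the buffer is empty; the flag reports success.
    return (x if q is EMPTY else q), (EMPTY, q is EMPTY)


def rej(q, x):
    """Rejecting transition: the state is unchanged and nothing is output."""
    return q, (EMPTY, False)


def auto(H, q0, S):
    """Run the finite transition sequence H on input S from state q0.

    The state is threaded through internally and hidden from the result.
    """
    q, out = q0, []
    for step, x in zip(H, S):
        q, y = step(q, x)
        out.append(y)
    return out


# One finite prefix of a run taken from (rej* . acc)^omega.
trace = auto([rej, acc, acc, rej, acc], EMPTY, [1, 1, 2, REQUEST, REQUEST])
```

In this run the first write is rejected, the second is accepted, the write of `2` finds the buffer full, the first request is rejected and the second request delivers the stored value `1`.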



19 Processes and Synchronised Parallel Composition 



While the previous two sections are appropriate for the SPF view of distributed 
systems, we now define operators that are adequate for the trace view (cf. Sec- 
tion 6) . The particular definitions given here draw strongly on the corresponding 
ones in [25]. 

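Assume an overall alphabet $A$ for our streams. A process is a pair $(B, \mathcal{B})$ where $B \subseteq A$ is the alphabet of the process and $\mathcal{B} \subseteq B^\infty$ is a behaviour. We set

\[ \alpha(B, \mathcal{B}) \;\stackrel{\mathrm{def}}{=}\; B , \qquad \beta(B, \mathcal{B}) \;\stackrel{\mathrm{def}}{=}\; \mathcal{B} . \]

An auxiliary operation is the projection $\upharpoonright$ of words to an alphabet $B \subseteq A$. It is defined inductively as follows:

\[ \varepsilon \upharpoonright B \;\stackrel{\mathrm{def}}{=}\; \varepsilon , \]
\[ (a \cdot s) \upharpoonright B \;\stackrel{\mathrm{def}}{=}\; \begin{cases} a \cdot (s \upharpoonright B) & \text{if } a \in B \\ s \upharpoonright B & \text{otherwise.} \end{cases} \]

Projection is extended pointwise to languages and behaviours. The projection of a stream is a stream again.

Using projection we can characterise processes in another way: the pair $(B, \mathcal{B})$ is a process iff $\forall S \in \mathcal{B} : S \upharpoonright B = S$.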
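The inductive definition of projection is a plain filter on finite words. The following is a minimal sketch, assuming that traces are Python tuples of action names and that behaviours are approximated by finite sets of such traces; `project` and `is_process` are illustrative names.

```python
def project(word, alphabet):
    """Projection of a finite word to an alphabet: keep exactly the actions in it."""
    return tuple(a for a in word if a in alphabet)


def is_process(alphabet, behaviour):
    """Check the characterisation: every trace equals its own projection."""
    return all(project(t, alphabet) == t for t in behaviour)


projected = project(("a", "x", "b", "x"), {"a", "b"})
```

Here `projected` is `("a", "b")`, and a pair whose traces mention actions outside the alphabet fails the `is_process` check.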

We need to lift the notion of refinement to processes. We allow that a process is refined by another one that has additional “internal” actions. Since then refinement amounts to inclusion of (the projection of) the behaviour, we abuse notation and write again $\subseteq$ for the refinement relation:

\[ P \subseteq Q \;\stackrel{\mathrm{def}}{\Leftrightarrow}\; \alpha P \supseteq \alpha Q \wedge (\beta P) \upharpoonright \alpha Q \subseteq \beta Q . \]

In this case we say that $P$ refines $Q$. It is easily checked that $\subseteq$ is a partial order on processes.

If behaviours are “loose enough” in that they allow arbitrary actions in be- 
tween the “interesting” ones, one can model synchronised parallel composition 
very simply by intersection (see eg. [26]). For general behaviours this works well 
only if they are “loosened” by interspersing arbitrary actions between the proper 
ones; this is again taken from [25]. The intersection then allows only traces in 
which the actions interesting to both partners occur in a sequence that is ac- 
ceptable to both partners (ie. allowed in both behaviours) whereas the private 
actions of each partner are not constrained by the other partner. 

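Hence, for processes $P$ and $Q$, we define the parallel composition $P \| Q$ by

\[ \alpha(P \| Q) \;\stackrel{\mathrm{def}}{=}\; \alpha P \cup \alpha Q , \]
\[ S \in \beta(P \| Q) \;\stackrel{\mathrm{def}}{\Leftrightarrow}\; S = S \upharpoonright \alpha(P \| Q) \;\wedge\; S \upharpoonright \alpha P \in \beta P \;\wedge\; S \upharpoonright \alpha Q \in \beta Q . \]

Note, in particular, that $\|$ is commutative, associative and idempotent then. Moreover,

\[ P \subseteq Q \;\Rightarrow\; P \| Q = P . \]

If $\alpha P = \alpha Q$ then $\beta(P \| Q) = \beta P \cap \beta Q$.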

This parallel composition operator will be used in our extended example in 
Section 22.3. 
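The definition of $P \| Q$ can be explored by brute force on small examples. The following is a minimal sketch under stated assumptions: a process is a pair of a set of action names and a finite set of finite traces (tuples), candidate traces are enumerated up to a hypothetical bound `max_len`, and the names `parallel` and `refines` are illustrative.

```python
from itertools import product


def project(word, alphabet):
    """Projection of a finite trace to an alphabet."""
    return tuple(a for a in word if a in alphabet)


def parallel(P, Q, max_len):
    """Bounded P || Q: traces over the joint alphabet accepted by both partners."""
    (alpha_p, beh_p), (alpha_q, beh_q) = P, Q
    alpha = alpha_p | alpha_q
    beh = {
        t
        for n in range(max_len + 1)
        for t in product(sorted(alpha), repeat=n)
        if project(t, alpha_p) in beh_p and project(t, alpha_q) in beh_q
    }
    return alpha, beh


def refines(P, Q):
    """P refines Q: larger alphabet, and the projected behaviour is included."""
    (alpha_p, beh_p), (alpha_q, beh_q) = P, Q
    return alpha_p >= alpha_q and {project(t, alpha_q) for t in beh_p} <= beh_q


# Two partners that synchronise on the shared action "c".
P = ({"a", "c"}, {("a", "c")})
Q = ({"b", "c"}, {("b", "c")})
joint_alphabet, joint_behaviour = parallel(P, Q, max_len=3)
```

The private actions `a` and `b` may occur in either order before the shared action `c`, so `joint_behaviour` consists of the two interleavings `("a", "b", "c")` and `("b", "a", "c")`.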



Part IV: Safety and Liveness 

We have already informally discussed safety and liveness (see eg. [33, 1,21]). 
We want to show how these notions can be expressed algebraically. In [1] and 
subsequent papers, a property is a set of infinite sequences of states. The appro- 
priate counterpart in our setting is therefore a set of streams, more generally, 
ideals, ie. a behaviour. 

20 Safety 

20.1 Definition and Topological Properties 

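In [1] a behaviour $\mathcal{B} \subseteq A^\omega$ over infinite streams is called safe if the following holds:

\[ \forall S \in A^\omega : S \notin \mathcal{B} \Rightarrow ( \exists s \in S : \forall T \in A^\omega : s \circ T \notin \mathcal{B} ) . \]

This means that for every stream not in the behaviour there is a decisive finite prefix $s$ where something went “irreparably wrong” in that no continuation of $s$ can bring the computation back to the “good path”.

We want to simplify the formal definition above by moving from logic to algebra. First, using contraposition, the formula can be transformed to

\[ \forall S \in A^\omega : ( \forall s \in S : \exists T \in A^\omega : s \circ T \in \mathcal{B} ) \Rightarrow S \in \mathcal{B} . \]

Now, recalling the definition $\mathsf{pref}\,\mathcal{B} = \bigcup \mathcal{B}$ from Section 10, we have

\[ \exists T \in A^\omega : s \circ T \in \mathcal{B} \;\Leftrightarrow\; s \in \mathsf{pref}\,\mathcal{B} . \qquad (7) \]

Hence the safety condition reduces to

    $\forall S \in A^\omega : ( \forall s \in S : s \in \mathsf{pref}\,\mathcal{B} ) \Rightarrow S \in \mathcal{B}$
$\Leftrightarrow$ {[ set theory ]}
    $\forall S \in A^\omega : S \subseteq \mathsf{pref}\,\mathcal{B} \Rightarrow S \in \mathcal{B}$
$\Leftrightarrow$ {[ by prefix-closedness of $\mathsf{pref}\,\mathcal{B}$ and Lemma 11.4.1 ]}
    $\forall S \in A^\omega : S \in \mathsf{str}\,\mathsf{pref}\,\mathcal{B} \Rightarrow S \in \mathcal{B}$
$\Leftrightarrow$ {[ defining $\mathsf{should}\,\mathcal{B} \stackrel{\mathrm{def}}{=} \mathsf{str}\,\mathsf{pref}\,\mathcal{B}$ ]}
    $\mathsf{should}\,\mathcal{B} \subseteq \mathcal{B}$ .

This simplified form involves only order-theoretic notions and hence generalises easily to arbitrary ideal completions. Consider a partial order $(M, \le)$ and a behaviour $\mathcal{B} \subseteq M^\infty$. Then we call $\mathcal{B}$ safe iff $\mathsf{should}\,\mathcal{B} \subseteq \mathcal{B}$, where

\[ \mathsf{should}\,\mathcal{B} \;\stackrel{\mathrm{def}}{=}\; \mathsf{ide}\,\mathsf{pref}\,\mathcal{B} . \]

By $\subseteq$-monotonicity of $\mathsf{pref}$ and $\mathsf{ide}$ also $\mathsf{should}$ is $\subseteq$-monotonic. Note that for all $\mathcal{B} \subseteq M^\infty$ we have $\mathcal{B} \subseteq \mathsf{should}\,\mathcal{B}$. So a behaviour $\mathcal{B} \subseteq M^\infty$ is safe iff $\mathcal{B} = \mathsf{should}\,\mathcal{B}$. Moreover,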



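Lemma 20.1. 1. Safe behaviours are closed under arbitrary intersections and finite unions.

2. $\mathsf{should}$ is idempotent.

3. $\mathsf{should}\,\mathcal{B}$ is the least safe behaviour containing $\mathcal{B}$.

Proof. 1. Assume a family $(\mathcal{B}_j)_{j \in J}$ of safe behaviours. Then for all $j \in J$ we have by monotonicity of $\mathsf{should}$ and safety of $\mathcal{B}_j$ that

\[ \mathsf{should}\,\Big( \bigcap_{j \in J} \mathcal{B}_j \Big) \subseteq \mathsf{should}\,\mathcal{B}_j \subseteq \mathcal{B}_j , \]

so that

\[ \mathsf{should}\,\Big( \bigcap_{j \in J} \mathcal{B}_j \Big) \subseteq \bigcap_{j \in J} \mathcal{B}_j , \]

ie. $\bigcap_{j \in J} \mathcal{B}_j$ is safe again.

For union we calculate, for $I \in M^\infty$,

    $I \in \mathsf{should}\,(\mathcal{B} \cup \mathcal{C})$
$\Leftrightarrow$ {[ definition and distributivity of $\mathsf{pref}$ ]}
    $I \subseteq \mathsf{pref}\,\mathcal{B} \cup \mathsf{pref}\,\mathcal{C}$
$\Leftrightarrow$ {[ Boolean algebra ]}
    $I = (I \cap \mathsf{pref}\,\mathcal{B}) \cup (I \cap \mathsf{pref}\,\mathcal{C})$ .

Set now $I_{\mathcal{B}} \stackrel{\mathrm{def}}{=} I \cap \mathsf{pref}\,\mathcal{B}$ and $I_{\mathcal{C}} \stackrel{\mathrm{def}}{=} I \cap \mathsf{pref}\,\mathcal{C}$. Since $I$ is directed, by Lemma 30.5.2 we have $I_{\mathcal{B}} \le I_{\mathcal{C}}$ or $I_{\mathcal{C}} \le I_{\mathcal{B}}$. So by downward closedness of $I_{\mathcal{B}}$ and $I_{\mathcal{C}}$ and Lemma 30.3.3 we have $I_{\mathcal{B}} \subseteq I_{\mathcal{C}}$ or $I_{\mathcal{C}} \subseteq I_{\mathcal{B}}$ and hence $I = I_{\mathcal{B}}$ or $I = I_{\mathcal{C}}$. But then, by Boolean algebra and the definition, we get $I \in \mathsf{should}\,\mathcal{B} \vee I \in \mathsf{should}\,\mathcal{C}$, so that by safety of $\mathcal{B}$ and $\mathcal{C}$ also $I \in \mathcal{B} \cup \mathcal{C}$.

2. For $I \in M^\infty$ we have

    $I \in \mathsf{should}\,\mathsf{should}\,\mathcal{B}$
$\Leftrightarrow$ {[ definitions ]}
    $I \subseteq \bigcup \{ J \in M^\infty : J \subseteq \mathsf{pref}\,\mathcal{B} \}$
$\Leftrightarrow$ {[ using the principal ideals $J = x^{\le}$ for $x \in \mathsf{pref}\,\mathcal{B}$ ]}
    $I \subseteq \mathsf{pref}\,\mathcal{B}$
$\Leftrightarrow$ {[ definitions ]}
    $I \in \mathsf{should}\,\mathcal{B}$ .

3. Let $\mathcal{C}$ be safe with $\mathcal{B} \subseteq \mathcal{C}$. Then

    $\mathsf{should}\,\mathcal{B}$
$\subseteq$ {[ monotonicity ]}
    $\mathsf{should}\,\mathcal{C}$
$=$ {[ safety of $\mathcal{C}$ ]}
    $\mathcal{C}$ .

But $\mathsf{should}\,\mathcal{B}$ is safe by 2.

□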

By these properties, the safe behaviours coincide with the closed sets of a 
topology on M°° (cf. eg. [59]) and should is the topological closure operator. 



20.2 Safety and Snapshot sets 

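Let us now study how safety is reflected in snapshot sets. In other words, we want to know when for $P \subseteq M$ the behaviour $\mathsf{ide}\,P$ is safe. We calculate, for $I \in M^\infty$,

    $I \in \mathsf{should}\,\mathsf{ide}\,P$
$\Leftrightarrow$ {[ definition of $\mathsf{should}$ ]}
    $I \in \mathsf{ide}\,\mathsf{pref}\,\mathsf{ide}\,P$
$\Leftrightarrow$ {[ by Lemma 11.4.2 ]}
    $I \in \mathsf{ide}\,(P^{\le})$
$\Leftrightarrow$ {[ by Lemma 11.4.1 ]}
    $I \subseteq P^{\le}$

and hence

    $\mathsf{ide}\,P$ is safe
$\Leftrightarrow$ {[ by the above ]}
    $\forall I \in M^\infty : I \subseteq P^{\le} \Rightarrow I \in \mathsf{ide}\,P$
$\Rightarrow$ {[ $\forall u \in P^{\le} : u^{\le} \subseteq P^{\le}$ ]}
    $\forall u \in P^{\le} : u^{\le} \in \mathsf{ide}\,P$
$\Rightarrow$ {[ since for $D \in \mathsf{dir}\,P$ we have $D^{\le} = u^{\le} \Rightarrow u \in D$ ]}
    $P^{\le} \subseteq P$ .

On the other hand,

\[ P^{\le} \subseteq P \;\Rightarrow\; \forall I \in M^\infty : I \subseteq P^{\le} \Rightarrow I \subseteq P . \]

Altogether we have shown

Lemma 20.2. The behaviour $\mathsf{ide}\,P$ is safe iff $P^{\le} \subseteq P$, ie. iff $P$ is downward closed.

For that reason we call a snapshot set $P \subseteq M$ a safety property iff it is downward closed. We have

Corollary 20.3. If $I \in M^\infty$ and $P$ is a safety property, then

\[ I \in \mathsf{ide}\,P \;\Leftrightarrow\; I \subseteq P . \]

Proof. Immediate from Lemma 11.4.1. □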

For a safety property $P$ the behaviour $\mathsf{ide}\,P$ is closed under unions (ie. suprema) of $\subseteq$-ascending chains of streams. In the special case of streams, safety properties are simply prefix-closed subsets of $A^*$.

21 Continual Satisfaction

21.1 The General Case

In connection with safety issues one is interested in the set of all objects that satisfy a property also in all their finite approximations. Given a property $P \subseteq M$ we define the property $\mathsf{saf}\,P$ by

\[ \mathsf{saf}\,P \;\stackrel{\mathrm{def}}{=}\; \{ x \in M : x^{\le} \subseteq P \} . \]

The set $\mathsf{saf}\,P$ has also been termed the prefix kernel of $P$ in [50,67]. We have

Lemma 21.1. 1. $\mathsf{saf}\,P \subseteq P$.

2. $\mathsf{saf}\,P = P$ iff $P$ is a safety property. In particular, $\mathsf{saf}\,P^{\le} = P^{\le}$.

3. $\mathsf{saf}\,P$ is the greatest safety property contained in $P$.

4. $\mathsf{saf}$ is monotonic and strict w.r.t. $\emptyset$.

5. $\mathsf{saf}\,(P \cap Q) = \mathsf{saf}\,P \cap \mathsf{saf}\,Q$.

6. $I \in \mathsf{ide}\,\mathsf{saf}\,P \;\Leftrightarrow\; I \subseteq P$.

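Proof. 1.
    $x \in \mathsf{saf}\,P$
$\Leftrightarrow$ {[ definition ]}
    $x^{\le} \subseteq P$
$\Rightarrow$ {[ $x \in x^{\le}$ ]}
    $x \in P$ .

2. ($\Rightarrow$)
    $x \in P$
$\Rightarrow$ {[ assumption ]}
    $x \in \mathsf{saf}\,P$
$\Leftrightarrow$ {[ definition ]}
    $x^{\le} \subseteq P$ .

($\Leftarrow$)
    $x \in P$
$\Rightarrow$ {[ assumption ]}
    $x^{\le} \subseteq P$
$\Leftrightarrow$ {[ definition ]}
    $x \in \mathsf{saf}\,P$

so $P \subseteq \mathsf{saf}\,P$; the reverse inclusion was shown in 1.

3. It is obvious that $\mathsf{saf}\,P$ is a safety property. Let $Q \subseteq P$ be a safety property and $x \in Q$. By definition then $x^{\le} \subseteq Q \subseteq P$ and hence $x \in \mathsf{saf}\,P$.

4. Immediate from the definition.

5.
    $x \in \mathsf{saf}\,(P \cap Q)$
$\Leftrightarrow$ {[ definition ]}
    $x^{\le} \subseteq P \cap Q$
$\Leftrightarrow$ {[ infimum property of intersection ]}
    $x^{\le} \subseteq P \wedge x^{\le} \subseteq Q$
$\Leftrightarrow$ {[ definition ]}
    $x \in \mathsf{saf}\,P \wedge x \in \mathsf{saf}\,Q$ .

6.
    $I \in \mathsf{ide}\,\mathsf{saf}\,P$
$\Leftrightarrow$ {[ by Lemma 11.1 ]}
    $I \subseteq (I \cap \mathsf{saf}\,P)^{\le}$
$\Leftrightarrow$ {[ since $\mathsf{saf}\,P \subseteq P$ ]}
    $I \subseteq (\mathsf{saf}\,P)^{\le}$
$\Leftrightarrow$ {[ by downward closedness of $\mathsf{saf}\,P$ ]}
    $I \subseteq \mathsf{saf}\,P$
$\Leftrightarrow$ {[ by downward closedness of $I$ ]}
    $I \subseteq P$ .

□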
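For a finite property over words the prefix kernel of Lemma 21.1 can be computed directly from the definition. The following is a minimal sketch, assuming that words are Python strings ordered by the prefix relation and that properties are finite sets; `prefixes` and `saf` are illustrative names. Since $\mathsf{saf}\,P \subseteq P$, it suffices to inspect the elements of $P$.

```python
def prefixes(x):
    """Downward closure x^<= of a single word under the prefix order."""
    return {x[:i] for i in range(len(x) + 1)}


def saf(P):
    """Prefix kernel: the words of P all of whose prefixes lie in P."""
    return {x for x in P if prefixes(x) <= P}


P = {"", "a", "ab", "ba"}
Q = {"", "a", "b", "ba"}
kernel = saf(P)   # "ba" is excluded because its prefix "b" is not in P

# A pair of properties on which saf fails to distribute through union.
P1 = {"", "ab"}
Q1 = {"a"}
```

On this example `saf` is idempotent and distributes through the intersection of `P` and `Q`, whereas `saf(P1 | Q1)` is strictly larger than `saf(P1) | saf(Q1)`.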



Note that saf does not distribute through union. We can now state further 
distributivity properties for ide: 

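Lemma 21.2. Consider $N, P \subseteq M$. Then

1. $N = \mathsf{saf}\,N \;\Rightarrow\; \mathsf{ide}\,(N \cap P) = \mathsf{ide}\,N \cap \mathsf{ide}\,P$.

2. $N = \mathsf{saf}\,N \;\Rightarrow\; \mathsf{inf\,ide}\,(N \cap P) = \mathsf{inf\,ide}\,N \cap \mathsf{inf\,ide}\,P$.

3. $\mathsf{ide}\,Q \cap \mathsf{ide}\,P \subseteq \mathsf{ide}\,(Q^{\le} \cap P^{\le})$.

Proof. 1. We only need to show ($\supseteq$), since the reverse inclusion follows from monotonicity of $\mathsf{ide}$.

Assume $S \in \mathsf{ide}\,N \cap \mathsf{ide}\,P$, say $S = D^{\le} = E^{\le}$ with $D \in \mathsf{dir}\,N \wedge E \in \mathsf{dir}\,P$. By Lemma 30.4 then $E \le D$, and by Lemma 30.3.2 we get $E \le N$, since $D \subseteq N$. Now $N = N^{\le}$ shows $E \subseteq N$. Since $E \subseteq P$ we get $E \subseteq N \cap P$ and, since $E$ is directed, even $E \in \mathsf{dir}\,(N \cap P)$. This shows that $S = E^{\le}$ and hence $S \in \mathsf{ide}\,(N \cap P)$.

2. Immediate from 1 and equation (5).

3. Immediate from 1, Lemma 21.1.2 and monotonicity of $\mathsf{ide}$.

□

21.2 Deriving a Recursion for saf

Next, for the particular case of streams we want to derive a grammar-like or automaton-like representation for safety properties of the form $\mathsf{saf}\,P$ for some $P \subseteq A^*$. We use induction on the words involved. For the induction base we calculate

    $\varepsilon \in \mathsf{saf}\,P$
$\Leftrightarrow$ {[ definition ]}
    $\varepsilon^{\le} \subseteq P$
$\Leftrightarrow$ {[ $\varepsilon^{\le} = \varepsilon$ ]}
    $\varepsilon \in P$ .

For the induction step, we have, for arbitrary $c \in A$,

    $c \cdot s \in \mathsf{saf}\,P$
$\Leftrightarrow$ {[ definition ]}
    $(c \cdot s)^{\le} \subseteq P$
$\Leftrightarrow$ {[ by (3) ]}
    $c^{\le} \cup c \cdot s^{\le} \subseteq P$
$\Leftrightarrow$ {[ set theory ]}
    $c^{\le} \subseteq P \wedge c \cdot s^{\le} \subseteq P$
$\Leftrightarrow$ {[ $c^{\le} = \varepsilon \cup c$ ]}
    $\varepsilon \in P \wedge c \in P \wedge c \cdot s^{\le} \subseteq P$ .

We assume now that $P$ itself is already given in the form of an automaton-like recursion. Then there is a systematic way for passing from that to a recursion for $\mathsf{saf}\,P$. Suppose that $P$ satisfies, for all $c \in A$ and $U \subseteq A^*$,

\[ c \cdot U \subseteq P \;\Leftrightarrow\; U \subseteq F_c(P) \qquad (8) \]

for some function $F : A \to (\mathcal{P}(A^*) \to \mathcal{P}(A^*))$. In other words, we assume that the “recursive call” $F_c(P)$ depends only on the first symbol of the word to be analyzed. Note that this assumption means a Galois connection between $c \cdot$ and $F_c$.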
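Under this assumption we can continue as follows:

    $c \cdot s^{\le} \subseteq P$
$\Leftrightarrow$ {[ by assumption (8) ]}
    $s^{\le} \subseteq F_c(P)$
$\Leftrightarrow$ {[ definition ]}
    $s \in \mathsf{saf}\,F_c(P)$ .

Note that a bi-implication linear in $s$ results. To sum up, we have shown

Lemma 21.3. Suppose property $P \in \mathcal{P}(A^*)$ satisfies

\[ c \cdot U \subseteq P \;\Leftrightarrow\; U \subseteq F_c(P) . \]

Then, for $U \ne \emptyset$,

\[ \varepsilon \in \mathsf{saf}\,P \;\Leftrightarrow\; \varepsilon \in P , \]
\[ c \cdot U \subseteq \mathsf{saf}\,P \;\Leftrightarrow\; \varepsilon \in P \wedge c \in P \wedge U \subseteq \mathsf{saf}\,F_c(P) . \]

Assume now that we are given two properties $P$ and $Q$ and seek a recursion for $\mathsf{saf}\,P \cap \mathsf{saf}\,Q = \mathsf{saf}\,(P \cap Q)$. The following result is immediate from Lemma 21.1.5 and Lemma 21.3:

Lemma 21.4. Suppose $P, Q$ satisfy

\[ ( c \cdot U \subseteq P \Leftrightarrow U \subseteq F_c(P) ) \;\wedge\; ( c \cdot U \subseteq Q \Leftrightarrow U \subseteq G_c(Q) ) . \]

Then, for $U \ne \emptyset$,

\[ \varepsilon \in \mathsf{saf}\,(P \cap Q) \;\Leftrightarrow\; \varepsilon \in P \cap Q , \]
\[ c \cdot U \subseteq \mathsf{saf}\,(P \cap Q) \;\Leftrightarrow\; c^{\le} \subseteq P \cap Q \wedge U \subseteq \mathsf{saf}\,( F_c(P) \cap G_c(Q) ) . \]

This corresponds to the construction of a product automaton.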
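The recursion of Lemma 21.3 can be checked against the direct definition of $\mathsf{saf}$ on finite properties. The following is a minimal sketch under a labelled assumption: $F_c(P)$ is taken to be the left quotient $\{ u : c \cdot u \in P \}$, which is one concrete function satisfying the Galois connection (8); the source only postulates the existence of such an $F$. Words are Python strings and all function names are illustrative.

```python
def quotient(c, P):
    """Assumed choice of F_c(P): the left quotient {u : c + u in P}."""
    return {w[1:] for w in P if w[:1] == c}


def in_saf_direct(x, P):
    """Direct definition: every prefix of x lies in P."""
    return all(x[:i] in P for i in range(len(x) + 1))


def in_saf_rec(x, P):
    """Recursion of Lemma 21.3, consuming one symbol per step."""
    if x == "":
        return "" in P
    c, s = x[0], x[1:]
    return "" in P and c in P and in_saf_rec(s, quotient(c, P))


def words(alphabet, max_len):
    """All words over the alphabet up to the given length."""
    result = [""]
    frontier = [""]
    for _ in range(max_len):
        frontier = [w + a for w in frontier for a in alphabet]
        result.extend(frontier)
    return result


P = {"", "a", "ab", "aba", "b", "abb", "bab"}
Q = {"", "a", "ab", "abb", "ba"}
```

On all words over `{a, b}` up to length four the recursive and the direct test agree, and running the recursion on `P & Q` corresponds to advancing both quotients in lockstep, as in a product automaton.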

22 Liveness 

22.1 Definition and Topological Properties 

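Following again [1] we call a behaviour $\mathcal{B}$ over streams live iff

\[ \forall s \in A^* : \exists T \in A^\omega : s \circ T \in \mathcal{B} . \]

Using again (7) we can reduce this to

\[ \forall s \in A^* : s \in \mathsf{pref}\,\mathcal{B} \]

and hence to

\[ A^* \subseteq \mathsf{pref}\,\mathcal{B} . \]

Since $A^*$ is the set of compact elements of $A^\infty$ we can again easily generalise this to arbitrary ideal completions. Consider a partial order $(M, \le)$ and a behaviour $\mathcal{B} \subseteq M^\infty$. Then $\mathcal{B}$ is called live iff

\[ M \subseteq \mathsf{pref}\,\mathcal{B} . \]

We show now (see again [1])

Lemma 22.1. $\mathcal{B}$ is live iff it is topologically dense in $M^\infty$, ie. $\mathsf{should}\,\mathcal{B} = M^\infty$.

Proof.
    $M \subseteq \mathsf{pref}\,\mathcal{B}$
$\Leftrightarrow$ {[ for ($\Rightarrow$) use transitivity of inclusion, for ($\Leftarrow$) the principal ideals $J = x^{\le}$ for $x \in M$ ]}
    $\forall J \in M^\infty : J \subseteq \mathsf{pref}\,\mathcal{B}$
$\Leftrightarrow$ {[ by Corollary 20.3 and the definition of $\mathsf{should}$ ]}
    $\forall J \in M^\infty : J \in \mathsf{should}\,\mathcal{B}$
$\Leftrightarrow$ {[ set theory ]}
    $M^\infty \subseteq \mathsf{should}\,\mathcal{B}$
$\Leftrightarrow$ {[ set theory ]}
    $M^\infty = \mathsf{should}\,\mathcal{B}$ .

□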

Now we obtain 

Lemma 22.2. Every behaviour is the interseetion of a live and a safe behaviour. 

Proof. We could copy the proof of the respective theorem in [1] verbatim, since 
it proceeds purely in topological terms. However, we give a simpler proof that 
avoids most of the topological reasoning in [1]. 

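Assume $B \subseteq M^\infty$. We have

$B$

$=$ {[ since $B \subseteq \mathrm{should}\,B$ ]}

$\mathrm{should}\,B \setminus (\mathrm{should}\,B \setminus B)$

$=$ {[ definition of $\setminus$, where $\overline{C}$ denotes the complement
of $C$ w.r.t. $M^\infty$ ]}

$\mathrm{should}\,B \cap \overline{\mathrm{should}\,B \cap \overline{B}}$

$=$ {[ de Morgan and double complement ]}

$\mathrm{should}\,B \cap (\overline{\mathrm{should}\,B} \cup B)$ .

Since $\mathrm{should}\,B$ is safe, the claim is shown if $\overline{\mathrm{should}\,B} \cup B$ is live. We calculate

$\mathrm{should}\,(\overline{\mathrm{should}\,B} \cup B)$

$\supseteq$ {[ since should is $\subseteq$-monotonic and hence superdistributive over $\cup$ ]}

$\mathrm{should}\,(\overline{\mathrm{should}\,B}) \cup \mathrm{should}\,B$

$\supseteq$ {[ since should is extensive ]}

$\overline{\mathrm{should}\,B} \cup \mathrm{should}\,B$

$=$ {[ definition of complement ]}

$M^\infty$

so that we are done by Lemma 22.1. □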

Inspection of the proof leads to the following abstraction. Consider a Boolean
algebra $(K, \le)$ with greatest element $\top$. Call a function $f : K \to K$ a pre-closure
if it is extensive, i.e. satisfies $\forall x : x \le f(x)$, and monotonic. Next, say that $y \in K$
is $f$-dense if $f(y) = \top$. Then we have

Corollary 22.3. Every element $x$ of $K$ is the meet of an $f$-image and an $f$-dense
element, viz.

$x = f(x) \sqcap (\overline{f(x)} \sqcup x)$ .

Another way of replacing the topological proof of Lemma 22.2 in [1] by a 
proof over Boolean algebras is presented in [24]. However, our proof is simpler 
still. 
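
The decomposition of Corollary 22.3 can be checked mechanically on a small Boolean algebra. The following is a minimal sketch, assuming the powerset of {1,...,8} as the algebra and a hypothetical pre-closure `f` that adds all multiples of the elements of a set; neither choice comes from the text, and any extensive, monotonic map would do.

```python
from itertools import combinations

TOP = frozenset(range(1, 9))  # greatest element of the powerset algebra (assumed universe)


def f(x):
    """Hypothetical pre-closure: add every multiple (within TOP) of an element of x.

    It is extensive (m divides m) and monotonic in x.
    """
    return frozenset(m for m in TOP if any(m % d == 0 for d in x))


def complement(x):
    """Complement with respect to TOP."""
    return TOP - x


def decompose(x):
    """Return the pair (f-image, f-dense element) whose meet should be x."""
    closed = f(x)
    dense = complement(closed) | x
    return closed, dense


def all_subsets(universe):
    """Enumerate every subset of the universe as a frozenset."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def check_corollary():
    """Check x = f(x) meet dense and f(dense) = TOP for every subset x."""
    for x in all_subsets(TOP):
        closed, dense = decompose(x)
        if closed & dense != x or f(dense) != TOP:
            return False
    return True
```

Calling `check_corollary()` runs through all 256 subsets; `decompose(frozenset({2}))` shows one concrete pair of factors.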

22.2 Liveness and Snapshot Sets 

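As in the case of safety, we now investigate when a property $P$ spans a live
behaviour. We calculate

$\mathrm{ide}\,P$ is live

$\Leftrightarrow$ {[ definition ]}

$M \subseteq \mathrm{pref}\,\mathrm{ide}\,P$

$\Leftrightarrow$ {[ by Lemma 11.4.2 ]}

$M \subseteq P^{\le}$

$\Leftrightarrow$ {[ definition of $\le$ ]}

$M \le P$ .

Hence we call $P \subseteq M$ a liveness property iff $M \le P$.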

22.3 Spanning Infinite Behaviours by Snapshot Sets

We now define that part of a snapshot set that is relevant for the infinite streams.
We call a set $Q \subseteq M$ lively iff $Q \neq \emptyset \wedge \max Q = \emptyset$.

Lemma 22.4. 1. If $Q$ is lively and $x \in Q$ then there is an $I \in \mathrm{inf\,ide}\,Q$ with
$x \in I$.

2. If $Q$ is lively then $\mathrm{inf\,ide}\,Q \neq \emptyset$.

3. If M itself is lively then for every B we have M°° C B iff ml M°° C infB. 

Proof. 1. We construct a chain $(x_i)_{i \in \mathbb{N}}$ as follows. Choose $x_0 \stackrel{\mathrm{def}}{=} x$. Assume
now that $x_i$ has been chosen. Since $x_i \notin \max Q = \emptyset$, there is an $x_{i+1} \in Q$
with $x_i < x_{i+1}$.

By construction then $K \stackrel{\mathrm{def}}{=} \{x_i : i \in \mathbb{N}\} \in \mathrm{dir}\,Q$ and hence $I \stackrel{\mathrm{def}}{=} K^{\le} \in
\mathrm{ide}\,Q$. Moreover, $\max I = \max K = \emptyset$, i.e., $I \in \mathrm{inf\,ide}\,Q$.

2. Immediate from 1.

3. Immediate from 1.

□ 

In connection with the results below, property 3. will allow easier liveness 
proofs. Note that this is particularly relevant for the case of streams, since the 
set A* of compact elements itself is lively. 

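Now we define the live part of $P \subseteq M$ as

$\mathrm{liv}\,P \stackrel{\mathrm{def}}{=} \bigcup \mathcal{L}_P$

where

$\mathcal{L}_P \stackrel{\mathrm{def}}{=} \{Q \subseteq P : Q \text{ lively}\}$ .

This operation enjoys the following properties:

Lemma 22.5. 1. $\mathrm{liv}\,P \subseteq P$.

2. $\max \mathrm{liv}\,P = \emptyset$.

3. $P$ is lively iff $P \neq \emptyset \wedge P = \mathrm{liv}\,P$.

4. liv is $\subseteq$-monotonic.

5. $\mathrm{liv}\,\mathrm{liv}\,P = \mathrm{liv}\,P$.

6. $\mathrm{liv}\,P \neq \emptyset \Rightarrow \mathrm{inf\,ide}\,P \neq \emptyset$.

7. $\mathcal{L}_P \neq \emptyset \Rightarrow \mathrm{inf\,ide}\,P \neq \emptyset$.

8. $\mathrm{inf\,ide}\,P = \mathrm{inf\,ide}\,\mathrm{liv}\,P$.

9. $\mathrm{pref}\,\mathrm{inf\,ide}\,P = (\mathrm{liv}\,P)^{\le}$.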

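Proof. 1. Clear from the definition.

2. Assume $x \in \max \mathrm{liv}\,P \subseteq \mathrm{liv}\,P$. Then there is a $Q \in \mathcal{L}_P$ with $x \in Q$. Since
$x \notin \max Q = \emptyset$, there is $y \in Q \subseteq \mathrm{liv}\,P$ with $x < y$. Contradiction!

3. The implication ($\Rightarrow$) is clear.

For the converse we use that $\max P = \max \mathrm{liv}\,P = \emptyset$ by 2.

4. We have $P \subseteq Q \Rightarrow \mathcal{L}_P \subseteq \mathcal{L}_Q \Rightarrow \bigcup \mathcal{L}_P \subseteq \bigcup \mathcal{L}_Q$.

5. By 1 and 4 we have $\mathrm{liv}\,\mathrm{liv}\,P \subseteq \mathrm{liv}\,P$. For the converse we calculate

$Q \subseteq P \wedge Q$ lively

$\Rightarrow$ {[ definition of $\mathrm{liv}\,P$ ]}

$Q \subseteq \mathrm{liv}\,P \wedge Q$ lively

$\Rightarrow$ {[ definition of $\mathcal{L}$ ]}

$\mathcal{L}_P \subseteq \mathcal{L}_{\mathrm{liv}\,P}$

$\Rightarrow$ {[ monotonicity of $\bigcup$ and definition of liv ]}

$\mathrm{liv}\,P \subseteq \mathrm{liv}\,\mathrm{liv}\,P$ .

6. By 2 we have $\max \mathrm{liv}\,P = \emptyset$. Now from Lemma 22.4 and using $\subseteq$-monotonicity
of the inf ide-operation we get $\emptyset \neq \mathrm{inf\,ide}\,\mathrm{liv}\,P \subseteq \mathrm{inf\,ide}\,P$.

7. First note that $\emptyset \notin \mathcal{L}_P$. But then $\bigcup \mathcal{L}_P \neq \emptyset$ iff $\mathcal{L}_P \neq \emptyset$. Now apply 6.

8. By 1 and $\subseteq$-monotonicity of inf ide we get $\supseteq$. Assume, conversely, $I \in
\mathrm{inf\,ide}\,P$. Then there is a $D \in \mathrm{dir}\,P$ with $I = D^{\le}$. Since $D$ is directed, we
have $D \neq \emptyset$. Moreover, $\max D = \max I = \emptyset$. So $D \in \mathcal{L}_P$ and hence also
$D \subseteq \mathrm{liv}\,P$, i.e. $D \in \mathrm{dir}\,\mathrm{liv}\,P$. So $I \in \mathrm{inf\,ide}\,\mathrm{liv}\,P$ as well.

9. $\mathrm{pref}\,\mathrm{inf\,ide}\,P$

$=$ {[ definitions ]}

$\bigcup \{D^{\le} : D \in \mathrm{dir}\,P \wedge \max D = \emptyset\}$

$=$ {[ distributivity ]}

$(\bigcup \{D : D \in \mathrm{dir}\,P \wedge \max D = \emptyset\})^{\le}$

$\subseteq$ {[ definition of $\mathcal{L}$ ]}

$(\bigcup \mathcal{L}_P)^{\le}$

$=$ {[ definition of liv ]}

$(\mathrm{liv}\,P)^{\le}$ .

Assume conversely $y \in (\mathrm{liv}\,P)^{\le}$. There is a $Q \in \mathcal{L}_P$ and an $x \in Q$ with $y \le x$.
By 22.4.1 there is an $I \in \mathrm{inf\,ide}\,Q \subseteq \mathrm{inf\,ide}\,P$ with $x \in I$.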

□ 

So in particular, liv is again a kernel operator. Moreover, by 7, to show that 
a snapshot set P spans infinite ideals, it suffices to exhibit a lively Q C P. Such 
Qs can frequently be constructed by induction. 



Part V: Extended Example: Buffers and Queues 

23 Specification of a Bounded Buffer 

As an example of the use of our constructs, we give a specification of bounded
buffer and queue modules. For this we use the particular domain $A^\infty = A^* \cup A^\omega$
of finite and infinite streams over a set $A$ of atomic actions. This example uses
the trace view (cf. Section 6) of streams. It was motivated by the asynchronous
bounded queue implementation in the collection of the IFIP WG10.5 benchmark
problems in hardware verification [28].

The buffer module has one input and one output port. In describing such
modules, we choose the letters $a$ for the action of inputting and $b$ for outputting
and set $A = \{a,b\}$. Boundedness of a module can be enforced by requiring
the number of input actions to exceed the number of output actions by at most
some $n \in \mathbb{N}$ which then is the capacity of the device.

We denote by $s_c$ the number of occurrences of $c \in A$ in $s \in A^*$. Formally,

$\varepsilon_c \stackrel{\mathrm{def}}{=} 0$ ,

$(a \bullet s)_c \stackrel{\mathrm{def}}{=} \delta_{ac} + s_c$ ,

where $\delta$ is the Kronecker symbol defined by

$\delta_{xy} \stackrel{\mathrm{def}}{=} \begin{cases} 1 & \text{if } x = y , \\ 0 & \text{otherwise} . \end{cases}$

Generalising the above informal description slightly, we define, for $n \in \mathbb{Z}$
and $a,b \in A$, the set

$EX^{ab}_n \stackrel{\mathrm{def}}{=} \{s \in A^* : s_a \le s_b + n\}$

of snapshots. Then $s \in EX^{ab}_n$ may be pronounced as “$a$ exceeds $b$ by at most $n$ in
$s$”. The specification is, however, very loose in that the balance between $a$s and
$b$s might be struck only at the very end of a word. For instance, … $\in EX^{ab}_n$.

So the restriction may be violated in prefixes and only established in the end.
For bounded devices, this is not possible. They need a stronger specification.
Therefore we strengthen our snapshot set to the safety property

$B^{ab}_n \stackrel{\mathrm{def}}{=} \mathrm{saf}\,EX^{ab}_n$ .

Now $\mathrm{ide}\,B^{ab}_n$ is the set of all finite and infinite streams that satisfy $EX^{ab}_n$ in all
prefixes. However, we are interested in devices that work for an unbounded time.
This is specified by considering as overall behaviour of such a device the set

Bf =*' infstrB“'’ 

consisting only of infinite admissible streams. 
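
The difference between the snapshot set and its safety strengthening can be made concrete on finite words. The following is a minimal sketch, assuming words are Python strings over single-character actions and reading saf as "all prefixes lie in the snapshot set", as described above; the function names are hypothetical.

```python
def count(s, c):
    """s_c: number of occurrences of action c in the finite word s."""
    return sum(1 for x in s if x == c)


def in_ex(s, a, b, n):
    """Snapshot test: s lies in EX_n^{ab} iff s_a <= s_b + n."""
    return count(s, a) <= count(s, b) + n


def in_saf_ex(s, a, b, n):
    """Safety test: every prefix of s (including s itself) lies in EX_n^{ab}."""
    return all(in_ex(s[:k], a, b, n) for k in range(len(s) + 1))
```

For example, the word `"aab"` satisfies `in_ex("aab", "a", "b", 1)` because the balance is restored by the final `b`, but `in_saf_ex("aab", "a", "b", 1)` fails since the prefix `"aa"` already exceeds the bound.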

A buffer is a device in which the number of outputs must not exceed the 
number of inputs. Hence we define 

<kf 

Note the reversal of the arguments in the superscript. The finitary property
$B^{ba}_0$ spells out to $s_b \le s_a$, as required. This describes an unbounded buffer. A
bounded buffer of capacity $n$ then is described by

d^f ^jrab p j^ab 

This specifies the set of all infinite streams for which, in all finite prefixes, the 
number of outputs does not exceed the number of inputs and the number of 
inputs does not exceed the number of outputs by more than n. 

24 Transformation to Automaton Form 

We consider now again the family of properties $EX^{ab}_n$. From its predicative definition
in the previous section we want to calculate a more “operational” description
corresponding to a generating grammar or accepting automaton. This can
be done by a simple unfold/fold transformation using induction on the words in
$A^*$. For the induction basis we calculate

$\varepsilon \in EX^{ab}_n$

$\Leftrightarrow$ {[ definition of EX ]}

$\varepsilon_a \le \varepsilon_b + n$

$\Leftrightarrow$ {[ definition of count ]}

$0 \le 0 + n$

$\Leftrightarrow$ {[ arithmetic ]}

$0 \le n$ .

For the induction step, we consider an arbitrary $c \in A$:

$c \bullet s \in EX^{ab}_n$

$\Leftrightarrow$ {[ definition of EX ]}

$(c \bullet s)_a \le (c \bullet s)_b + n$

$\Leftrightarrow$ {[ definition of count ]}

$\delta_{ca} + s_a \le \delta_{cb} + s_b + n$

$\Leftrightarrow$ {[ arithmetic ]}

$s_a \le s_b + n + \delta_{cb} - \delta_{ca}$

$\Leftrightarrow$ {[ definition of EX ]}

$s \in EX^{ab}_{n + \delta_{cb} - \delta_{ca}}$ .

Note that the recursion relations are linear bi-implications. Therefore we obtain,
for $U \neq \emptyset$,

$\varepsilon \in EX^{ab}_n \;\Leftrightarrow\; 0 \le n$ ,

$c \bullet U \subseteq EX^{ab}_n \;\Leftrightarrow\; U \subseteq EX^{ab}_{n + \delta_{cb} - \delta_{ca}}$ .

This corresponds to an infinite grammar with nonterminals $EX^{ab}_n$ or an infinite
automaton with states $EX^{ab}_n$ ($n \in \mathbb{Z}$).
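
The recursion derived above can be compared against the predicative definition by brute force. The following is a minimal sketch, assuming words are Python strings and using hypothetical function names; the state of the "infinite automaton" is simply the integer index $n$, updated by $\delta_{cb} - \delta_{ca}$ for each symbol read.

```python
from itertools import product


def delta(x, y):
    """Kronecker symbol."""
    return 1 if x == y else 0


def in_ex_direct(s, a, b, n):
    """Predicative definition: s_a <= s_b + n."""
    return s.count(a) <= s.count(b) + n


def in_ex_rec(s, a, b, n):
    """Recursive (automaton-style) definition derived by unfold/fold."""
    if s == "":
        return 0 <= n
    c, rest = s[0], s[1:]
    return in_ex_rec(rest, a, b, n + delta(c, b) - delta(c, a))


def agree_up_to(length, alphabet="abc", bounds=range(-2, 4)):
    """Check that both definitions agree on all words up to the given length."""
    for k in range(length + 1):
        for letters in product(alphabet, repeat=k):
            s = "".join(letters)
            for n in bounds:
                if in_ex_direct(s, "a", "b", n) != in_ex_rec(s, "a", "b", n):
                    return False
    return True
```

The third letter `c` in the default alphabet exercises the case $\delta_{cb} = \delta_{ca} = 0$, where the index is left unchanged.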

25 Counting Resumed 

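Next we want a similar representation for

$B^{ab}_n \stackrel{\mathrm{def}}{=} \mathrm{saf}\,EX^{ab}_n$ .

This can be done quite systematically using Lemma 21.3. We obtain, for $U \neq \emptyset$,

$\varepsilon \in B^{ab}_n \;\Leftrightarrow\; 0 \le n$ ,

$c \bullet U \subseteq B^{ab}_n \;\Leftrightarrow\; 0 \le n \wedge 0 \le n \wedge U \subseteq B^{ab}_n$ for $c \in A \setminus \{a, b\}$ ,

$a \bullet U \subseteq B^{ab}_n \;\Leftrightarrow\; 0 \le n \wedge 0 \le n - 1 \wedge U \subseteq B^{ab}_{n-1}$ ,

$b \bullet U \subseteq B^{ab}_n \;\Leftrightarrow\; 0 \le n \wedge 0 \le n + 1 \wedge U \subseteq B^{ab}_{n+1}$ ,

which simplifies to

$\varepsilon \in B^{ab}_n \;\Leftrightarrow\; 0 \le n$ ,

$c \bullet U \subseteq B^{ab}_n \;\Leftrightarrow\; 0 \le n \wedge U \subseteq B^{ab}_n$ for $c \in A \setminus \{a, b\}$ ,

$a \bullet U \subseteq B^{ab}_n \;\Leftrightarrow\; 0 \le n - 1 \wedge U \subseteq B^{ab}_{n-1}$ ,

$b \bullet U \subseteq B^{ab}_n \;\Leftrightarrow\; 0 \le n \wedge U \subseteq B^{ab}_{n+1}$ .

In particular, $B^{ab}_n = \emptyset$ for $n < 0$.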

Now we consider the bounded buffer behaviour. We calculate: 

= {[ definition ]} 

BT^^ n B®'’ 

= -{[ definition ]} 

Bg® n B®'’ 

$=$ {[ definition ]}

$\mathrm{inf\,str}\,B^{ba}_0 \cap \mathrm{inf\,str}\,B^{ab}_n$

$=$ {[ by Lemma 12.3.5, since the $B$ sets have been specified as safety
properties ]}

$\mathrm{inf\,str}\,(B^{ba}_0 \cap B^{ab}_n)$ .

So the problem has been reduced to finding an explicit representation for $B^{ba}_0 \cap
B^{ab}_n$, which is a simple product automaton construction. It is a special case of
the automaton for

$G_{mn} \stackrel{\mathrm{def}}{=} B^{ba}_m \cap B^{ab}_n$ .
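
The product construction can be sketched directly from the simplified recursion for $B^{ab}_n$: each component keeps one integer index, and a word is rejected as soon as either component would require a negative index. This is a minimal sketch under the assumption that words are Python strings; `step_b` and `accepts_g` are hypothetical names, and only finite words (the snapshots) are handled.

```python
def step_b(n, c, a, b):
    """One transition of the automaton for B_n^{ab}.

    Returns the next index, or None if the symbol c is rejected in state n.
    """
    if c == a:
        return n - 1 if n - 1 >= 0 else None   # a . U in B_n  iff  0 <= n-1 and U in B_{n-1}
    if c == b:
        return n + 1 if n >= 0 else None       # b . U in B_n  iff  0 <= n   and U in B_{n+1}
    return n if n >= 0 else None               # other symbols leave the index unchanged


def accepts_g(word, m, n, a="a", b="b"):
    """Membership of a finite word in G_mn = B_m^{ba} ∩ B_n^{ab} via the product automaton."""
    for c in word:
        m = step_b(m, c, b, a)   # component B_m^{ba}: the roles of a and b are swapped
        n = step_b(n, c, a, b)   # component B_n^{ab}
        if m is None or n is None:
            return False
    return m >= 0 and n >= 0     # the empty remainder is accepted iff both indices are >= 0
```

With `m = 0` and `n = 1` only two states are ever reached, `(0, 1)` and `(1, 0)`, so the accepted words are exactly the prefixes of alternating `a`, `b` sequences.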

26 Decomposition 

Let us now define a buffer process by setting 

Then using parallel composition we can state the following nice decomposition 
properties: 

Lemma 26.1. 1. $EX^{ab}_m \cap EX^{bc}_n \subseteq EX^{ac}_{m+n}$.

2. $B^{ab}_m \cap B^{bc}_n \subseteq B^{ac}_{m+n}$.

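3. $S \restriction \{a, b\} \in BB^{ab}_m \wedge S \restriction \{b, c\} \in BB^{bc}_n \Rightarrow S \restriction \{a, c\} \in BB^{ac}_{m+n}$.

4. $BB^{ab}_m \parallel BB^{bc}_n \subseteq BB^{ac}_{m+n}$.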

Proof. 1. $s \in EX^{ab}_m \cap EX^{bc}_n$

$\Leftrightarrow$ {[ definition ]}

$s_a \le s_b + m \wedge s_b \le s_c + n$

$\Rightarrow$ {[ transitivity and monotonicity ]}

$s_a \le s_c + m + n$

$\Leftrightarrow$ {[ definition ]}

$s \in EX^{ac}_{m+n}$ .

2. immediate from 1., Lemma 21.1.5 and Lemma 21.1.4. 

3. immediate from 2. 

4. immediate from 3. 

□ 

This allows decomposing a buffer of capacity $n$ into a parallel composition
of $n$ buffers of capacity 1. Of course, it needs to be shown that the
intersections/parallel compositions are non-empty. This follows from our results in
Section 22: first, it is easy to show that

$(a \bullet b)^* \subseteq EX^{ab}_n \;\Leftarrow\; n \ge 0$ ,

$\mathrm{inf\,str}\,(a \bullet b)^* \subseteq BB^{ab}_n \;\Leftarrow\; n \ge 1$ .

From this we get

$\mathrm{inf\,str}\,(a \bullet b \bullet c)^* \subseteq (BB^{ab}_m \parallel BB^{bc}_n) \;\Leftarrow\; m, n \ge 1$ .

Since $(a \bullet b \bullet c)^*$ is lively, $\mathrm{inf\,str}\,(a \bullet b \bullet c)^*$ and hence $BB^{ab}_m \parallel BB^{bc}_n$ are non-empty.



27 The One-Place Buffer 



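For the special case of $n = 1$ we have

$BB^{ab}_1 = \mathrm{inf\,str}\,G_{01}$ ,

where, for $U \neq \emptyset$,

$\varepsilon \in G_{01} \Leftrightarrow \mathrm{TRUE}$ , $\qquad \varepsilon \in G_{10} \Leftrightarrow \mathrm{TRUE}$ ,

$a \bullet U \subseteq G_{01} \Leftrightarrow U \subseteq G_{10}$ , $\qquad a \bullet U \subseteq G_{10} \Leftrightarrow \mathrm{FALSE}$ ,

$b \bullet U \subseteq G_{01} \Leftrightarrow \mathrm{FALSE}$ , $\qquad b \bullet U \subseteq G_{10} \Leftrightarrow U \subseteq G_{01}$ .

This corresponds to a two-state accepting automaton for the bounded buffer
property, which is sufficient for purposes of implementation.

However, the above can also be seen as a regular grammar or system of
equations for languages. We can calculate from it a regular expression for $G_{01}$
using twice

(Arden’s Rule) $\qquad \varepsilon \notin U \wedge X = V \cup U \bullet X \;\Rightarrow\; X = U^* \bullet V$ .

This gives

$G_{01} = (a \bullet b)^* \bullet (\varepsilon \cup a)$ .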
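
The regular expression obtained from Arden's rule can be compared with the two-state automaton on all short words. The following is a minimal sketch, assuming words are Python strings over `a` and `b`; `in_g01` and `ARDEN` are hypothetical names, and the regular expression `(ab)*a?` is the usual Python spelling of $(a \bullet b)^* \bullet (\varepsilon \cup a)$.

```python
import re
from itertools import product


def in_g01(word):
    """Two-state automaton from the text: state 0 stands for G_01, state 1 for G_10."""
    state = 0
    for c in word:
        if state == 0 and c == "a":
            state = 1
        elif state == 1 and c == "b":
            state = 0
        else:
            return False  # one of the FALSE entries of the recursion
    return True  # the empty word lies in both G_01 and G_10


ARDEN = re.compile(r"(ab)*a?")  # (a.b)* . (eps | a)


def regex_matches_automaton(max_len):
    """Check that automaton and regular expression agree on all words up to max_len."""
    for k in range(max_len + 1):
        for letters in product("ab", repeat=k):
            w = "".join(letters)
            if in_g01(w) != (ARDEN.fullmatch(w) is not None):
                return False
    return True
```

Running `regex_matches_automaton(8)` compares the two descriptions on all 511 words of length at most 8.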







Using Example 14.5 and Corollary 14.4, we obtain

$BB^{ab}_1 = \mathrm{inf\,str}\,(a \bullet b)^*$ .

Finally we use the fact that the language $a \bullet b$ as a singleton trivially satisfies
the Fano condition, so that Lemma 16.1 gives

$BB^{ab}_1 = (a \bullet b)^{\omega}$ ,

as expected.



28 From Buffers to Queues 



So far we have only talked about the relative order of input and output events.
For queues also the relative order of input and output values is relevant. We use
now the refined alphabet $A = C \times V$ where $C$ is the set of channel names and
$V$ the set of values. An element of $A$ will be denoted as $c(v)$. As a shorthand we
introduce

$c \stackrel{\mathrm{def}}{=} \{c(x) : x \in V\}$ . (9)

For a word $w \in A^*$ we define the word $\mathrm{chans}(w)$ of channels on which activity
occurred and for each $c \in C$ the word $\mathrm{vals}_c(w)$ of values transmitted along $c$.
Their inductive definitions read

$\mathrm{chans}(\varepsilon) = \varepsilon$ ,

$\mathrm{chans}(b(x) \bullet w) = b \bullet \mathrm{chans}(w)$ ,

$\mathrm{vals}_c(\varepsilon) = \varepsilon$ ,

$\mathrm{vals}_c(b(x) \bullet w) = \begin{cases} x \bullet \mathrm{vals}_c(w) & \text{if } b = c , \\ \mathrm{vals}_c(w) & \text{otherwise} . \end{cases}$



These operations are extended pointwise to languages and behaviours. 
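
The two projections follow their inductive definitions directly. The following is a minimal sketch, assuming a finite word over $A = C \times V$ is represented as a Python list of `(channel, value)` pairs; this representation is an illustration choice and not part of the original text.

```python
def chans(word):
    """Word of channels on which activity occurred, in order of occurrence."""
    if not word:
        return []
    (b, _x), rest = word[0], word[1:]
    return [b] + chans(rest)


def vals(c, word):
    """Word of values transmitted along channel c, in order of occurrence."""
    if not word:
        return []
    (b, x), rest = word[0], word[1:]
    return ([x] if b == c else []) + vals(c, rest)
```

For the word `[("a", 1), ("b", 1), ("a", 2)]`, `chans` yields `["a", "b", "a"]`, while `vals("a", ...)` yields `[1, 2]` and `vals("b", ...)` yields `[1]`.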

With these operations we may specify the behaviour of a faithful component,
i.e. a component which does not re-order or lose messages when transmitting from
channel $a$ to channel $b$, as



d^f |5- . = valsdS)} . 



A bounded queue is then specified as a faithful bounded buffer: 

BQf =*' n BBf . 

Here $a,b$ in $BB^{ab}_n$ are to be understood according to abbreviation (9).

The decomposition properties for buffers carry over to queues, so that again
a queue of capacity $n$ can be refined into the parallel composition of $n$ queues of
capacity 1. Moreover, a similar calculation as before, using again Arden’s rule,
yields for the refinement

$BQ^{ab}_1 = \Big( \bigcup_{x \in V} a(x) \bullet b(x) \Big)^{\omega}$ .

29 Conclusion

We have introduced some algebraic operators and laws that can be used in 
the specification and derivation of systems. By abstracting from the domain of 
streams for which most of the notions were coined originally, we have obtained 
a rich set of laws which hold for a variety of domains. The order-theoretic ap- 
proach lends itself well to an algebraic treatment. The point-free formulation 
eases and compacts specifications, proofs of the basic properties and the actual 
derivations. Further research along these lines should search for similar algebraic
characterisations of other important notions about systems and explore their
algebraic properties.

Concerning the underlying theory, our domain-theoretic notions should be
tied in more closely with the topological view (see e.g. [55,59]). Moreover, in the
stream domain there obviously is a close connection with temporal operators:
str P is related to intermittent assertions [15] and hence the formula $\Box\Diamond P$ (always
eventually P) in temporal logic, while saf P corresponds to $\boxed{\mathrm{i}}\,P$ (P holds in all
initial subintervals [48]). These connections are made precise and carried over to
arbitrary domains in [44,45]. The resulting “modal algebra” as well as the ideal 
and stream algebra developed in the present paper need to be tried out in larger 
and more realistic case studies of deductive design of parallel systems. 

Acknowledgement: Many helpful remarks on this paper were provided by J. 
Baeten, A. Ponse and V. Stoltenberg-Hansen. 



References 

1. B. Alpern, F.B. Schneider: Defining liveness. Information Processing Letters 21, 
181-185 (1985) 

2. R.-J.R. Back: A calculus of refinements for program derivations. Acta Informatica 
25, 593-624 (1988) 

3. J.C.M. Baeten, W.P. Weijland: Process algebra. Cambridge Tracts in Theoretical
Computer Science 18. Cambridge: Cambridge University Press 1990 

4. J.W. de Bakker, J.I. Zucker: Compactness in semantics for merge and fair merge. 
In: E. Clarke and D. Kozen (eds.): Logics of Programs. Lecture Notes in Computer 
Science 164. Berlin: Springer 1983, 18-33 

5. F.L. Bauer, B. Möller, H. Partsch, P. Pepper: Formal program construction by
transformations — Computer-aided, Intuition-guided Programming. IEEE Trans- 
actions on Software Engineering 15, 165-180 (1989) 

6. G. Birkhoff: Lattice theory, 3rd edition. American Mathematical Society Collo- 
quium Publications, Vol. XXV. Providence, R.I.: AMS 1967 

7. R. Bird: Lectures on constructive functional programming. In: M. Broy (ed.): 
Constructive methods in computing science. NATO ASI Series. Series F: Computer 
and Systems Sciences 55. Berlin: Springer 1989, 151-216 

8. R.S. Bird, O. de Moor: Algebra of programming. Prentice-Hall 1996 

9. J.D. Brock, W.B. Ackerman: Scenarios: a model of non-determinate computation. 
In: J. Díaz, I. Ramos (eds.): Formalization of programming concepts. Lecture Notes
in Computer Science 107. Berlin: Springer 1981, 252-259 







10. M. Broy: Specification and refinement of a buffer of length one. In: M. Broy (ed.): 
Deductive program design. NATO ASI Series, Series F: Computer and Systems 
Sciences 152. Berlin: Springer 1996, 273-304

11. M. Broy: Functional specification of time sensitive communicating systems. In: M. 
Broy (ed.): Programming and mathematical method. NATO ASI Series, Series F: 
Computer and Systems Sciences 88. Berlin: Springer 1992, 325-367 

12. M. Broy, F. Dederichs, C. Dendorfer, M. Fuchs, T.F. Gritzner, R. Weber: The 
design of distributed systems — an introduction to FOCUS. Revised Version. 
Institut für Informatik der TU München, Report TUM-I9202-2 (SFB-Bericht Nr.
342/2-2/92 A), 1993

13. M. Broy, G. Ştefănescu: The algebra of stream processing functions. Institut für
Informatik, TU München, Report TUM-I9620, 1996

14. M. Broy, K. Stølen: Specification and refinement of finite dataflow networks – a
relational approach. In: H. Langmaack, W.-P. de Roever, J. Vytopil (eds.): Formal 
techniques in real-time and fault-tolerant computing. Lecture Notes in Computer 
Science 863. Berlin: Springer 1994, 247-267 

15. R.M. Burstall: Program proving as hand simulation with a little induction. Proc. 
IFIP Congress 1974. Amsterdam: North-Holland 1974, 308-312

16. R.M. Burstall, J. Darlington: A transformation system for developing recursive 
programs. J. ACM 24, 44-67 (1977) 

17. K.M. Chandy, J. Misra: Parallel program design: a foundation. Reading, Mass.: 
Addison Wesley 1988 

18. J.H. Conway: Regular algebra and finite machines. London: Chapman and Hall 
1971 

19. B.A. Davey, H.A. Priestley: Introduction to lattices and order. Cambridge: Cam- 
bridge University Press 1990 

20. M. Davis: Infinitary games of perfect information. In: M. Dresher, L.S. Shapley,
A.W. Tucker (eds.): Advances in game theory. Princeton, N.J.: Princeton Univer- 
sity Press 1964, 89-101 

21. F. Dederichs, R. Weber: Safety and liveness from a methodological point of view. 
Information Processing Letters 36, 25-30 (1990) 

22. E.A. Emerson: Temporal and modal logic. In: J. van Leeuwen (ed.): Handbook of 
theoretical computer science. Volume B: Formal models and semantics. Amster- 
dam: Elsevier 1990, 995-1072 

23. M.S. Feather: A survey and classification of some program transformation ap- 
proaches and techniques. In L.G.L.T. Meertens (ed.): Proc. IFIP TC2 Working 
Conference on Program Specification and Transformation, Bad Tölz, April 14-17,
1986. Amsterdam: North-Holland 1987, 165-195 

24. H.P. Gumm: Another glance at the Alpern-Schneider characterization of safety 
and liveness in concurrent executions. Information Processing Letters 47, 291-294
(1993) 

25. C.A.R. Hoare: Communicating sequential processes. London: Prentice Hall 1985 

26. C.A.R. Hoare: Conjunction and concurrency. PARBASE 90, 1990 

27. J.K. Huggins: Kermit: specification and verification. In: E. Börger (ed.): Specifi-
cation and validation methods. Oxford: Clarendon Press 1995 

28. IFIP 94/97: IFIP WG 10.5 Verification Benchmarks. Web document under 
http://goethe.ira.uka.de/hvg/benchmarks.html

29. B. Jonsson: A fully abstract trace model for dataflow and asynchronous networks. 
Distributed Computing 7, 197-212 (1994)







30. G. Kahn: The semantics of a simple language for parallel processing. In: J.L. 
Rosenfeld (ed.): Information Processing 74. Proc. IFIP Congress 1974. Amsterdam: 
North-Holland 1974, 471-475 

31. B. von Karger, C.A.R. Hoare: Sequential calculus. Information Processing Letters
53, 123-130 (1995) 

32. J.N. Kok: A fully abstract semantics for data flow nets. In: J.W. de Bakker, A.J. 
Nijman, P.C. Treleaven (eds.): PARLE, Parallel languages and architectures Eu- 
rope, Volume I. Lecture Notes in Computer Science 259. Berlin: Springer 1987, 
351-368 

33. L. Lamport: Proving the correctness of multiprocess programs. IEEE Trans. Soft- 
ware Eng. SE-3, 125-143 (1977) 

34. L. Lamport: Specifying concurrent program modules. ACM TOPLAS 5, 190-222 
(1983) 

35. L.G.L.T. Meertens: Algorithmics — Towards programming as a mathematical 
activity. In: J. W. de Bakker et al. (eds.): Proc. CWI Symposium on Mathematics 
and Computer Science. CWI Monographs Vol 1. Amsterdam: North-Holland 1986, 
289-334 

36. R. Milner: Communication and concurrency. London: Prentice Hall 1989 

37. B. Möller: Relations as a program development language. In [38], 373-397

38. B. Möller (ed.): Constructing programs from specifications. Proc. IFIP TC2/WG
2.1 Working Conference on Constructing Programs from Specifications, Pacific
Grove, CA, USA, 13-16 May 1991. Amsterdam: North-Holland 1991, 373-397

39. B. Möller: Derivation of graph and pointer algorithms. In: B. Möller, H.A. Partsch,
S.A. Schuman (eds.): Formal program development. Lecture Notes in Computer
Science 755. Berlin: Springer 1993, 123-160

40. B. Möller: Algebraic calculation of graph and sorting algorithms. In: D. Bjørner,
M. Broy, I.V. Pottosin (eds.): Formal Methods in Programming and their Appli-
cations. Lecture Notes in Computer Science 735. Berlin: Springer 1993, 394-413

41. B. Möller, M. Russling: Shorter paths to graph algorithms. In: R.S. Bird, C.C.
Morgan, J.C.P. Woodcock (eds.): Mathematics of Program Construction. Lecture
Notes in Computer Science 669. Berlin: Springer 1993, 250-268. Science of Com-
puter Programming 22, 157-180 (1994)

42. B. Möller: Ideal streams. In: E.-R. Olderog (ed.): Programming concepts, methods
and calculi. IFIP Transactions A-56. Amsterdam: North-Holland 1994, 39-58

43. B. Möller: Refining ideal behaviours. Institut für Mathematik der Universität
Augsburg, Report Nr. 345, 1995

44. B. Möller: Temporal operators on partial orders. Proc. 3rd Domain Workshop,
Munich, May 29-31, 1997. Ludwig-Maximilian-Universität München (to appear)

45. B. Möller: Modal and Temporal Operators on Partial Orders. In: R. Berghammer
(ed.): Programmiersprachen und Grundlagen der Programmierung. Institut für
Informatik und Praktische Mathematik, Universität Kiel (to appear). Extended
version: Institut für Informatik der Universität Augsburg, Report 97-02, 1997

46. C.C. Morgan: Programming from Specifications. Prentice-Hall, 1990. 

47. J.M. Morris: A theoretical basis for stepwise refinement and the programming 
calculus. Science of Computer Programming 9, 287-306 (1987) 

48. B. Moszkowski: Some very compositional temporal properties. In: E.-R. Olderog 
(ed.): Programming concepts, methods and calculi. IFIP Transactions A-56. Am- 
sterdam: North-Holland 1994, 307-326 

49. M. Nivat: Behaviors of processes and synchronized systems of processes. In: M. 
Broy, G. Schmidt (eds.): Theoretical foundations of programming methodology. 
Dordrecht: Reidel 1982, 473-551 







50. E.-R. Olderog: Nets, terms and formulas. Cambridge: Cambridge University Press 
1991 

51. E.-R. Olderog, C.A.R. Hoare: Specification-oriented semantics for communicating 
processes. Acta Informatica 23, 9-66 (1986) 

52. D. Park: On the semantics of fair parallelism. In: D. Bjørner (ed.): Abstract software
specifications. Lecture Notes in Computer Science 86. Berlin: Springer 1980, 504- 
526 

53. H.A. Partsch: Specification and transformation of programs — A formal approach 
to software development. Berlin: Springer 1990 

54. G.D. Plotkin: A powerdomain construction. SIAM J. Computing 5, 452-487 (1976)

55. R. Redziejowski: Infinite-word languages and continuous mappings. Theoretical 
Computer Science 43, 59-79 (1986) 

56. F.J. Rietman: A relational calculus for the design of distributed algorithms. Dis- 
sertation, University of Utrecht, 1995 

57. R. Sharp: Principles of Protocol design. London: Prentice Hall 1994 

58. M.B. Smyth: Power domains. J. Computer Syst. Sciences 16, 23-36 (1978)

59. M.B. Smyth: Topology. In: S. Abramsky, D.M. Gabbay, T.S.E. Maibaum (eds.):
Handbook of logic in computer science. Vol. 1, Background: Mathematical struc-
tures. Oxford: Clarendon Press 1992, 641-761

60. L. Staiger: Research in the theory of ω-languages. J. Inf. Process. Cybern. EIK
23, 415-439 (1987)

61. L. Staiger: ω-languages. In: G. Rozenberg, A. Salomaa (eds.): Handbook of formal
languages. Vol. 3: Beyond words. Berlin: Springer 1997, 339-387

62. R. Stephens: A survey of stream processing. Acta Informatica 34, 491-541 (1997)

63. K. Stølen, M. Fuchs: An exercise in conditional refinement. In: B. Möller, J.V.
Tucker (eds.): Prospects for hardware foundations. Lecture Notes in Computer
Science 1546. Berlin: Springer (this volume)

64. V. Stoltenberg-Hansen, I. Lindström and E.R. Griffor: Mathematical theory of
domains. Cambridge Tracts in Theoretical Computer Science, Vol. 22. Cambridge: 
Cambridge University Press 1994 

65. W. Thomas: Automata on infinite objects. In: J. van Leeuwen (ed.): Handbook of 
theoretical computer science. Vol. B: Formal models and semantics. Amsterdam: 
Elsevier 1990, 133-191 

66. W. Thomas: Languages, automata and logic. In: G. Rozenberg, A. Salomaa (eds.): 
Handbook of formal languages. Vol. 3: Beyond words. Berlin: Springer 1997, 389- 
455 

67. J. Zwiers: Compositionality, concurrency and partial correctness. Lecture Notes in 
Computer Science 321. Berlin: Springer 1989



30 Appendix: Auxiliary Lemmas 

30.1 Cones and Maximal Elements 



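Lemma 30.1. Consider $N,P \subseteq M$. Then

1. $(N \cup P)^{\le} = N^{\le} \cup P^{\le} \wedge (N \cup P)^{<} = N^{<} \cup P^{<}$ (distributivity).

2. $(N^{\le})^{<} = N^{<} \wedge (N^{\le})^{\le} = N^{\le}$.

Lemma 30.2. Consider $N,P \subseteq M$. Then

1. $\max N = N^{\le} \setminus N^{<}$.

2. $\max N = \max N^{\le}$.

3. $N \subseteq P \Rightarrow N \cap \max P \subseteq \max N$.

4. $\max N \cap P^{<} = \emptyset \Rightarrow \max (N \cup P) = \max N \cup (\max P) \setminus N^{<}$.

Lemma 30.3. Consider $N,P \subseteq M$. Then

1. $N \le P \Leftrightarrow N \le P^{\le}$.

2. $L \subseteq N \wedge N \le P \wedge P \subseteq Q \Rightarrow L \le Q$.

3. $N \le P \Leftrightarrow N^{\le} \subseteq P^{\le}$.

4. $N \le P \Rightarrow \max (N \cup P) = \max P$.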

Lemma 30.4. Consider N,P C M. Then N P ^ N- = P-. 



Proof. Immediate from Lemma 30.3.3. 



□ 



30.2 Directed Sets 

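Lemma 30.5. Consider $N,P \subseteq M$. Then

1. $N \cup P \in \mathrm{dir}\,M \wedge N \le P \Rightarrow P \in \mathrm{dir}\,M$.

2. $N \cup P \in \mathrm{dir}\,M \Rightarrow (N \le P \vee P \le N) \wedge (N \in \mathrm{dir}\,M \vee P \in \mathrm{dir}\,M)$.

3. $Q^{\le} \in \mathrm{dir}\,M \Rightarrow Q \in \mathrm{dir}\,M$.

4. $N \le P \wedge P \in \mathrm{dir}\,M \Rightarrow N \cup P \in \mathrm{dir}\,M$.

5. $\mathrm{dir}\,(N \cup P) = \{K \cup L : (K \in \mathrm{dir}\,N \wedge L \subseteq P \wedge L \le K)\} \cup
\{K \cup L : (L \in \mathrm{dir}\,P \wedge K \subseteq N \wedge K \le L)\}$ .

Proof. 1. Assume $x,y \in P$. By directedness of $N \cup P$ there is a $z \in N \cup P$ with
$x \le z$ and $y \le z$. If $z \in P$, we are done. Otherwise, by $N \le P$ there is a
$u \in P$ with $z \le u$ so that by transitivity also $x \le u$ and $y \le u$.

2. For $N = \emptyset$ or $P = \emptyset$ the claim is trivial. So consider $N,P \neq \emptyset$ and suppose
$N \not\le P$. Then there is $x \in N$ with $x \notin P^{\le}$. Assume now $y \in P$. By directedness
of $N \cup P$ there is a $z \in N \cup P$ with $x,y \le z$. Since $x \notin P^{\le}$, it follows that
$z \in N \setminus P \subseteq N$. Since $y$ was arbitrary, we have shown $P \le N$.

The second conjunct is immediate from the first and 1.

3. Immediate from 1 by setting $N = Q^{\le}$, $P = Q$ and using $Q^{\le} \le Q$.

4. Assume $x,y \in N \cup P$. By $N \le P$ and $P \le P$ there are $u,v \in P$ with
$x \le u \wedge y \le v$. Since $P$ is directed, there is $z \in P$ with $u \le z$ and $v \le z$.
Hence also $x \le z$ and $y \le z$ by transitivity.

5. We show ($\subseteq$); the reverse inclusion is immediate from 4.

Consider $Q \in \mathrm{dir}\,(N \cup P)$. We have $Q = K \cup L$ where $K \stackrel{\mathrm{def}}{=} Q \cap N$ and
$L \stackrel{\mathrm{def}}{=} Q \cap P$. By 2 we know $K \le L \vee L \le K$. If $K \le L$ then $L \in \mathrm{dir}\,P$ by 1.
If $L \le K$ then $K \in \mathrm{dir}\,N$ by 1. This shows the claim.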



□

Abstract. We extend normalization by evaluation (first presented in
[4]) from the pure typed λ-calculus to general higher type term rewrite
systems. This work also gives a theoretical explanation of the
normalization algorithm implemented in the MINLOG system.



1 Introduction 

In interactive proof systems it is crucial to have a term rewriting machinery
available, in order to ease the burden of equational reasoning. Quite often term
rewriting can be reduced to normalization and therefore it is essential to
implement normalization of terms efficiently. By the same token, one then can also
effectively normalize whole proofs (which can be written as derivation terms,
using the Curry-Howard correspondence). Normalization is used when
extracting terms from formal proofs. For an application concerning circuits cf.
[12].

It is well known that implementing normalization of λ-terms in the usual
recursive fashion is quite inefficient. However, it is possible to compute the long
normal form of a λ-term by evaluating it in an appropriate model (cf. [4]). When
using for that purpose the built-in evaluation mechanism of e.g. Scheme (a pure
Lisp dialect) one obtains an amazingly fast algorithm called “normalization by
evaluation”. The essential idea is to find an inverse to evaluation, converting a
semantic object into a syntactic term. This normalization procedure is used and
tested in the proof system MINLOG developed in Munich (cf. [2]).

Obviously, for applications pure typed λ-terms are not sufficient, but it is
necessary to have constants in it. These were not considered in [4], but will be
treated in this paper.

Let us begin with a short explanation of the essence of the method for
normalizing typed λ-terms by means of an evaluation procedure of some functional
programming language such as Scheme. For simplicity we return to the simplest
case, simply typed λ-calculus without constants.

Simple types are built from ground types $\iota$ by $\rho \to \sigma$ (later also products
$\rho \times \sigma$ will be included). The set $\Lambda$ of terms is given by $x^{\rho}$, $(\lambda x^{\rho} M^{\sigma})^{\rho \to \sigma}$,
$(M^{\rho \to \sigma} N^{\rho})^{\sigma}$. The set LNF of terms in long normal form (i.e. normal w.r.t.
β-reduction and η-expansion) is defined inductively by $(x M_1 \ldots M_n)^{\iota}$, $\lambda x M$ (we
abbreviate $x M_1 \ldots M_n$ by $x\vec{M}$ and similarly a list $M_1 \ldots M_n$ by $\vec{M}$). By $\mathrm{lnf}(M)$
we denote the long normal form of $M$, i.e. the unique term in long normal form
βη-equal to $M$.

B. Möller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 117-137, 1998.

Springer-Verlag Berlin Heidelberg 1998

Now we have to choose our model. A simple solution is to take terms of
ground type as ground type objects, and all functions as possible function type
objects:

$[\![\iota]\!] := \Lambda_{\iota}$, $\qquad [\![\rho \to \sigma]\!] := [\![\sigma]\!]^{[\![\rho]\!]}$ (the full function space).

It is crucial that all terms (of ground type) are present, not just the closed ones.
Next we need an assignment $\uparrow$ lifting a variable to an object, and a function
$\downarrow$ giving us a normal term from an object. They should meet the following
condition, which might be called “correctness of normalization by evaluation”:

$\downarrow([\![M]\!]_{\uparrow}) = \mathrm{lnf}(M)$

where $[\![M]\!]_{\uparrow} \in [\![\rho]\!]$ denotes the value of $M$ under the assignment $\uparrow$. Two such
functions $\downarrow$ and $\uparrow$ can be defined simultaneously, by induction on the type. It is
convenient to define $\uparrow$ on all terms (not just on variables). Hence for every type
$\rho$ we define $\downarrow_{\rho} : [\![\rho]\!] \to \Lambda_{\rho}$ and $\uparrow_{\rho} : \Lambda_{\rho} \to [\![\rho]\!]$ (called reify and reflect) by

$\downarrow_{\iota}(M) := M$, $\qquad \uparrow_{\iota}(M) := M$,

$\downarrow_{\rho \to \sigma}(a) := \lambda x\, \downarrow_{\sigma}(a(\uparrow_{\rho}(x)))$ “$x$ new”, $\qquad \uparrow_{\rho \to \sigma}(M)(a) := \uparrow_{\sigma}(M \downarrow_{\rho}(a))$.

Here a little difficulty appears: what does it mean that $x$ is new? We will solve
this problem by slightly modifying the model and defining $[\![\iota]\!]$ to be the set of
families of terms of type $\iota$ (instead of single terms) and setting $\downarrow_{\rho \to \sigma}(a)(k) :=
\lambda x_k (\downarrow_{\sigma}(a(\uparrow_{\rho}(x_k^{\infty})))(k + 1))$, where $x_k^{\infty}$ is the constant family $x_k$. The definition of
$\uparrow_{\rho \to \sigma}$ has to be modified accordingly. This idea corresponds to a representation
of terms in the style of de Bruijn [9]. An advantage of this approach is that
we get the same normal form even if the terms are only equal up to renaming of
bound variables.
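
The reify/reflect pair with term families can be sketched in a few lines. The following is a minimal illustration for closed terms of the simply typed λ-calculus without constants; the tuple encoding of terms and types, the single ground type `"o"`, and all function names are assumptions made for this sketch and are not taken from the MINLOG implementation.

```python
GROUND = "o"  # the single ground type assumed in this sketch


def arrow(rho, sigma):
    """Function type rho -> sigma."""
    return ("->", rho, sigma)


def const_family(term):
    """The constant family k |-> term."""
    return lambda k: term


def reify(ty, a):
    """Turn a semantic object of type ty into a family of terms (k |-> term)."""
    if ty == GROUND:
        return a  # ground objects are already term families
    _, rho, sigma = ty

    def family(k):
        x = ("var", f"x{k}")  # the index k plays the role of "x new"
        body = reify(sigma, a(reflect(rho, const_family(x))))
        return ("lam", f"x{k}", body(k + 1))

    return family


def reflect(ty, m):
    """Turn a family of terms of type ty into a semantic object."""
    if ty == GROUND:
        return m
    _, rho, sigma = ty
    return lambda a: reflect(sigma, lambda k: ("app", m(k), reify(rho, a)(k)))


def evaluate(term, env):
    """Standard evaluation; abstractions become Python functions."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "lam":
        _, name, body = term
        return lambda a: evaluate(body, {**env, name: a})
    _, fun, arg = term
    return evaluate(fun, env)(evaluate(arg, env))


def normalize(term, ty):
    """Long normal form of a closed term of type ty, with bound names x0, x1, ..."""
    return reify(ty, evaluate(term, {}))(0)
```

For instance, normalizing the identity `("lam", "f", ("var", "f"))` at type `arrow(arrow("o", "o"), arrow("o", "o"))` produces the η-expanded term λx0 λx1. x0 x1, and terms that differ only in the names of bound variables yield the same result, since the generated names depend only on the nesting depth.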

The proof of correctness is easy: Since for the typed lambda calculus without
constants we have preservation of values, i.e. $[\![M]\!]_\xi = [\![\mathrm{lnf}(M)]\!]_\xi$ for all terms
$M$ and environments $\xi$, we only have to verify $\downarrow([\![N]\!]_\uparrow) = N$ for normal terms
$N$, which is straightforward. The situation is different when we add constants
together with rewrite rules, since then preservation of values (in our model) is
false in general. However, correctness of normalization by evaluation still holds,
but needs to be proven by a different method.

The structure of the paper is as follows. In section 2 we present the simply
typed $\lambda$-calculus with constants and pairing and give some examples of higher
order rewrite systems. Then we inductively define a relation $M \longrightarrow Q$, with
the intended meaning that $M$ is normalizable with long normal form $Q$, and
prove in section 4 the correctness of normalization by evaluation by showing
that $M \longrightarrow Q$ implies $\downarrow([\![M]\!]_\uparrow) = Q$. Hence the mapping $M \mapsto \downarrow([\![M]\!]_\uparrow)$ is
a normalization function. In order to define the semantics $[\![M]\!]$ of a term $M$
properly we use domain theory. This is described briefly in section 3.

In subsection 4.2 we show how to interpret a constant $c$ more efficiently if
all its rules are of some special forms. In fact, most of the rewrite rules in the
Minlog system have one of these forms. Again, normalization by evaluation is
shown to remain correct.

The final section 5 contains a review of the relevant literature [1, 5, 7, 8],
and also a comparison of the run-times of our algorithm with those coming
from the other setups. The results clearly support our claim that the present
approach (with function spaces as higher type domains) leads to a much faster
implementation.

Acknowledgements. The present work has benefitted considerably from ideas
of Felix Joachimski and Ralph Matthes concerning strategies for normalization
proofs, including $\eta$-expansion and primitive recursion. In particular, the idea to
employ the inductive definition of the relation $M \longrightarrow Q$ is essentially due to
them. We also want to thank Holger Benl for illuminating discussions.



2 A simply typed $\lambda$-calculus with constants



2.1 Types and terms; rewrite rules 



We start from a given set of ground types. Types are inductively generated from
ground types $\iota$ by $\rho \to \sigma$ and $\rho \times \sigma$. Terms are

    $x^\rho$                                                      typed variables,
    $c^\rho$                                                      constants,
    $(\lambda x^\rho M^\sigma)^{\rho\to\sigma}$                   abstractions,
    $(M^{\rho\to\sigma} N^\rho)^\sigma$                           applications,
    $\langle M_0^{\rho_0}, M_1^{\rho_1}\rangle^{\rho_0\times\rho_1}$   pairing,
    $\pi_i(M^{\rho_0\times\rho_1})^{\rho_i}$                      projections.

Ground types will always be denoted by $\iota$. We sometimes write $M0$ for $\pi_0(M)$
and $M1$ for $\pi_1(M)$. Two terms $M$ and $N$ are called $\alpha$-equal (written $M =_\alpha N$)
if they are equal up to renaming of bound variables. $\Lambda_\rho$ denotes the set of
all terms of type $\rho$. $M\vec{N}$ denotes $(\ldots((MN_1)N_2)\ldots)N_n$, where some of the $N_i$'s
may be 0 or 1. By $\mathrm{FV}(M)$ we denote the set of variables occurring free in $M$.
By $M_x[N]$ we mean substitution of every free occurrence of $x$ in $M$ by $N$,
renaming bound variables if necessary. Similarly $M_{\vec{x}}[\vec{N}]$ denotes simultaneous
substitution. Finally $\vec{\rho} \to \sigma$ stands for $\rho_1 \to (\rho_2 \to \ldots (\rho_n \to \sigma)\ldots)$ and $\lambda\vec{x}\,r$
abbreviates $\lambda x_1 \ldots \lambda x_n\, r$.

For the constants $c^\rho$ we assume that some rewrite rules of the form $c\vec{K} \mapsto N$
are given, where $\mathrm{FV}(N) \subseteq \mathrm{FV}(\vec{K})$ and $c\vec{K}$, $N$ have the same type (not necessarily
a ground type). Moreover, for any two $c$-rules $c\vec{K} \mapsto N$ and $c\vec{K}' \mapsto N'$ with
equal projection markers 0, 1 we require that $\vec{K}$ and $\vec{K}'$ are of the same length.
For example, if $c$ is of type $(\iota \to \iota \to \iota) \times (\iota \to \iota)$, then the rules $c0x_1x_2 \mapsto a$
and $c1x \mapsto b$ are admitted.

Since we allowed almost arbitrary rewrite rules, it may happen that a term
can be rewritten by different rules. In order to obtain a deterministic procedure
we assume that for every constant $c^{\vec{\rho}\to\sigma}$ we are given a function $\mathrm{sel}_c$ computing
for every tuple $\vec{M}^{\vec{\rho}}$ of terms either a rule $c\vec{K} \mapsto N$, in which case $\vec{M}$ is an
instance of $\vec{K}$, i.e. $\vec{M} = \vec{K}_{\vec{x}}[\vec{L}]$, or else the message "no-match", in which case
$\vec{M}$ doesn't match any rule, i.e. there is no rule $c\vec{K} \mapsto N$ such that $\vec{M}$ is an
instance of $\vec{K}$. Clearly $\mathrm{sel}_c$ should be compatible with $\alpha$-equality.

For readability a rule $c\vec{K} \mapsto \lambda\vec{y}\,N$ with $\vec{y}$ distinct variables not free in $\vec{K}$
will be written $c\vec{K}\vec{y} \mapsto N$.
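One simple way to obtain such a selection function is to try the rules in a fixed order and return the first one whose left-hand side matches. The sketch below is only an illustration under assumptions not made in the text: patterns are first-order (built from variables, constants and application), terms are encoded as Python tuples, and the names `match` and `make_sel` are hypothetical.

```python
# Assumed term encoding: ("var", name), ("const", name), ("app", fun, arg).
# In a pattern, every ("var", name) is a pattern variable.

def match(pattern, term, binding):
    """Extend `binding` so that the instantiated pattern equals `term`, or return None."""
    if pattern[0] == "var":
        name = pattern[1]
        if name in binding:
            return binding if binding[name] == term else None
        return {**binding, name: term}
    if pattern[0] != term[0]:
        return None
    if pattern[0] == "const":
        return binding if pattern[1] == term[1] else None
    # Application: match the function part, then the argument part.
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        binding = match(sub_pattern, sub_term, binding)
        if binding is None:
            return None
    return binding


def make_sel(rules):
    """Build a first-match selection function from a list of (lhs_args, rhs) rules."""
    def sel(args):
        for lhs_args, rhs in rules:
            if len(lhs_args) != len(args):
                continue
            binding = {}
            for pattern, term in zip(lhs_args, args):
                binding = match(pattern, term, binding)
                if binding is None:
                    break
            if binding is not None:
                return (lhs_args, rhs), binding
        return "no-match"
    return sel


ZERO, SUC = ("const", "0"), ("const", "Suc")
x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
# Two rules for a cases construct: first argument 0 selects y, a successor selects z.
if_rules = [((ZERO, y, z), y), ((("app", SUC, x), y, z), z)]
sel_if = make_sel(if_rules)

selected = sel_if((("app", SUC, ZERO), ("const", "a"), ("const", "b")))
stuck = sel_if((("const", "n"), ("const", "a"), ("const", "b")))
```

Here `selected` is the second rule together with the substitution for its pattern variables, and `stuck` is `"no-match"` because the first argument is neither `0` nor a successor. Fixing the order of the rules is what makes the procedure deterministic.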

2.2 Examples 

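(a) Usually we have the ground type $\iota$ of natural numbers available, with
constructors $0^\iota$, $\mathrm{Suc}^{\iota\to\iota}$ and recursion operators $R_\rho$ of type
$\iota\to\rho\to(\iota\to\rho\to\rho)\to\rho$. The rewrite rules for $R$ are

$$R\,0\,y\,z \mapsto y,$$

$$R(\mathrm{Suc}\,x)\,y\,z \mapsto z\,x\,(R\,x\,y\,z),$$

i.e. $R\,0 \mapsto \lambda y\lambda z\,y$ and $R(\mathrm{Suc}\,x) \mapsto \lambda y\lambda z(z\,x\,(R\,x\,y\,z))$. A simplified scheme of
the same form gives a cases construct:

$$\mathrm{if}\;0\,y\,z \mapsto y,$$

$$\mathrm{if}\;(\mathrm{Suc}\,x)\,y\,z \mapsto z.$$

We may also add a rewrite rule (due to McCarthy [11])

$$\mathrm{if}\;(\mathrm{if}\;x\,y\,z)\,u\,v \mapsto \mathrm{if}\;x\,(\mathrm{if}\;y\,u\,v)\,(\mathrm{if}\;z\,u\,v).$$

Moreover we can write down rules according to the usual recursive definitions of
addition and multiplication, and then also the rewrite rule

$$\mathrm{mult}(\mathrm{add}\;x\,y)\,z \mapsto \mathrm{add}(\mathrm{mult}\;x\,z)(\mathrm{mult}\;y\,z).$$

Simultaneous recursion may be treated as well, e.g.

$$\mathrm{odd}\;0 \mapsto \mathrm{Suc}\;0, \qquad \mathrm{even}\;0 \mapsto 0,$$

$$\mathrm{odd}(\mathrm{Suc}\;x) \mapsto \mathrm{even}\;x, \qquad \mathrm{even}(\mathrm{Suc}\;x) \mapsto \mathrm{odd}\;x.$$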

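(b) We can also deal with infinitely branching trees like the Brouwer ordinals
of type $\mathcal{O}$. We have constructors $0^{\mathcal{O}}$ and $\mathrm{Sup}^{(\iota\to\mathcal{O})\to\mathcal{O}}$, and for recursion constants
$\mathrm{Rec}_\rho$ of type $\mathcal{O}\to\rho\to((\iota\to\mathcal{O})\to(\iota\to\rho)\to\rho)\to\rho$. The rewrite rules for $\mathrm{Rec}$ are:

$$\mathrm{Rec}\;0\,y\,z \mapsto y,$$

$$\mathrm{Rec}(\mathrm{Sup}\;x)\,y\,z \mapsto z\,x\,(\lambda u\;\mathrm{Rec}(x\,u)\,y\,z).$$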

(c) It is well known that by the Curry-Howard correspondence natural deduction
proofs can be written as $\lambda$-terms with formulas as types. To use normalization
by evaluation for normalizing proofs we may also introduce a ground type
$\mathrm{ex}$ with constructors and destructors

$$(\exists^+_{\rho_0,\rho_1})^{\rho_0\to\rho_1\to\mathrm{ex}} \quad\text{and}\quad (\exists^-_{\rho_0,\rho_1,\sigma})^{\mathrm{ex}\to(\rho_0\to\rho_1\to\sigma)\to\sigma};$$

these are called existential constants. The rewrite rule for $\exists^-$ is

$$\exists^-(\exists^+ x_0 x_1)\,y \mapsto y\,x_0\,x_1.$$

The (constructive) existential quantifier can then be dealt with conveniently by
means of axioms

$$\exists^+\colon \forall x(A \to \exists x A),$$

$$\exists^-\colon \exists x A \to \forall x(A \to B) \to B \quad\text{with } x \notin \mathrm{FV}(B).$$

If $x$ has type $\rho_0$ and the formulas $A$ and $B$ are associated with the types $\rho_1$
and $\sigma$, respectively, the rewrite rule above is clear. It seems that the existential
type $\mathrm{ex}$ could be replaced by $\rho_0 \times \rho_1$ and the constants $\exists^+$ and $\exists^-$ by the
terms $\lambda x_0\lambda x_1\langle x_0, x_1\rangle$ and $\lambda z\lambda f(f\,\pi_0(z)\,\pi_1(z))$ respectively. However, the latter
term does not correspond to a derivation in first order logic, since it is impossible
to pass from an arbitrary derivation $d$ (possibly with free assumptions) of $\exists x A$
to a term $\pi_0(d)$ and a derivation $\pi_1(d)$ of $A_x[\pi_0(d)]$.
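The rewrite rules of example (a) can be read as a functional program. The following Python sketch illustrates this reading; representing numerals by Python integers, and the function names, are assumptions of the example and not part of the calculus.

```python
def R(n, y, z):
    """Recursion operator: R 0 y z = y and R (Suc x) y z = z x (R x y z)."""
    if n == 0:
        return y
    x = n - 1
    return z(x)(R(x, y, z))


def add(m, n):
    # add 0 n = n, add (Suc x) n = Suc (add x n)
    return R(m, n, lambda _x: lambda rec: rec + 1)


def mult(m, n):
    # mult 0 n = 0, mult (Suc x) n = add (mult x n) n
    return R(m, 0, lambda _x: lambda rec: add(rec, n))


def cases(n, y, z):
    # if 0 y z = y, if (Suc x) y z = z; the recursive result is ignored
    return R(n, y, lambda _x: lambda _rec: z)


def odd(n):
    # odd 0 = Suc 0, odd (Suc x) = even x
    return 1 if n == 0 else even(n - 1)


def even(n):
    # even 0 = 0, even (Suc x) = odd x
    return 0 if n == 0 else odd(n - 1)


# Both sides of the additional rule mult (add x y) z -> add (mult x z) (mult y z)
lhs = mult(add(2, 3), 4)
rhs = add(mult(2, 4), mult(3, 4))
```

Both `lhs` and `rhs` evaluate to 20, so the extra rule for `mult` is sound for this reading, although it is not an instance of the recursion scheme. Note that in the rules for `odd` and `even` the numeral 0 plays the role of "true", in agreement with the cases construct, which selects its first branch on 0.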



2.3 Normalizable terms and their normal forms 



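We inductively define a relation $M \longrightarrow Q$ for terms $M$, $Q$. The intended meaning
of $M \longrightarrow Q$ is that $M$ is normalizable with (long) normal form $Q$. Here it is
convenient to identify $\alpha$-equal terms.

Eta.

$$\frac{My \longrightarrow Q}{M^{\rho\to\sigma} \longrightarrow \lambda y\,Q}\ \ \text{for } y \notin \mathrm{FV}(M) \qquad\qquad \frac{\pi_0(M) \longrightarrow Q_0 \quad \pi_1(M) \longrightarrow Q_1}{M^{\rho\times\sigma} \longrightarrow \langle Q_0, Q_1\rangle}$$

For the next rules it is enough that they all have a conclusion $M \longrightarrow Q$ with
$M$, $Q$ of a ground type $\iota$.

Beta.

$$\frac{M_x[N]\vec{P} \longrightarrow Q}{(\lambda x M)N\vec{P} \longrightarrow Q} \qquad\qquad \frac{M_i\vec{P} \longrightarrow Q}{\pi_i(\langle M_0, M_1\rangle)\vec{P} \longrightarrow Q}\ \ \text{for } i \in \{0, 1\}$$

VarApp.

$$\frac{\vec{M} \longrightarrow \vec{M}'}{x\vec{M} \longrightarrow x\vec{M}'}$$

$\vec{M} \longrightarrow \vec{M}'$ abbreviates the list $M_1 \longrightarrow M_1', \ldots, M_n \longrightarrow M_n'$ of assumptions.

Moreover, for every constant $c$ we have the following rules.

Red.

$$\frac{\vec{M} \longrightarrow \vec{M}' \quad N_{\vec{x}}[\vec{L}]\vec{P} \longrightarrow Q}{c\vec{M}\vec{P} \longrightarrow Q}\ \ \text{if } \mathrm{sel}_c(\vec{M}') = c\vec{K} \mapsto N \text{ and } \vec{M}' = \vec{K}_{\vec{x}}[\vec{L}]$$

PassApp.

$$\frac{\vec{M} \longrightarrow \vec{M}' \quad \vec{P} \longrightarrow \vec{P}'}{c\vec{M}\vec{P} \longrightarrow c\vec{M}'\vec{P}'}\ \ \text{if } \mathrm{sel}_c(\vec{M}') = \text{no-match}$$

For readability we will write Red in the following form and always assume
that $c\vec{K} \mapsto N$ is the selected rule.

$$\frac{\vec{M} \longrightarrow \vec{K}_{\vec{x}}[\vec{L}] \quad N_{\vec{x}}[\vec{L}]\vec{P} \longrightarrow Q}{c\vec{M}\vec{P} \longrightarrow Q}$$

The set $\mathrm{Lnf}$ of terms in long normal form is defined as follows: $\lambda x M$, $\langle M, N\rangle$,
$x\vec{M}$ and $c\vec{M}\vec{N}$ are in $\mathrm{Lnf}$ if $M$, $N$, $\vec{M}$, $\vec{N}$ are and $\mathrm{sel}_c(\vec{M}) = \text{no-match}$. It
is obvious that $M \longrightarrow Q$ implies that $Q \in \mathrm{Lnf}$. Furthermore it can be shown
that in this case $M$ normalizes to $Q$ in the usual sense w.r.t. $\beta$-reduction,
$\eta$-expansion and the rewrite rules for the constants. Conversely, if $M$ is strongly
normalizable w.r.t. these reductions (i.e. every reduction sequence terminates)
then $M \longrightarrow Q$ for some $Q$. We omit the proofs of these facts since they will not
be needed in the sequel. Although $M \longrightarrow Q$ implies that $Q$ is a normal form of
$M$, the converse is not true in general. To see this, consider the non-terminating
rewrite rules $\mathrm{mult}\;x^\iota\,0 \mapsto 0$ and $\bot^\iota \mapsto \bot$. Then $0$ is a normal form of $\mathrm{mult}\;\bot\,0$,
but $\mathrm{mult}\;\bot\,0 \longrightarrow 0$ does not hold. Moreover, the relation $M \longrightarrow Q$ clearly is not
closed under substitution. However, it is closed under substitution of variables.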

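Lemma 1. If $M \longrightarrow Q$, then $M_x[z] \longrightarrow Q_x[z]$ with a derivation of the same
height.

Proof. We use induction on the height of the derivation of $M \longrightarrow Q$, and only
treat the rule Eta

$$\frac{My \longrightarrow Q}{M \longrightarrow \lambda y\,Q}\ \ \text{for } y \notin \mathrm{FV}(M)$$

with $M$ of type $\rho \to \sigma$. In case $x = y$ there is nothing to show, since then $x$ does
not occur free in the conclusion $M \longrightarrow \lambda y\,Q$. So assume $x \neq y$.

Subcase $y \neq z$. We must show $M_x[z] \longrightarrow \lambda y\,Q_x[z]$. By induction hypothesis
$M_x[z]y \longrightarrow Q_x[z]$, since $x \neq y$. Therefore by Eta $M_x[z] \longrightarrow \lambda y\,Q_x[z]$, since
$y \notin \mathrm{FV}(M_x[z])$ (because of $y \neq z$).

Subcase $y = z$. We must show $M_x[y] \longrightarrow (\lambda y\,Q)_x[y] = \lambda u\,Q_{x,y}[y, u]$ with a
new variable $u$. By induction hypothesis $Mu \longrightarrow Q_y[u]$ with a derivation of
the same height as that for $My \longrightarrow Q$, hence again by induction hypothesis
$M_x[y]u \longrightarrow Q_{x,y}[y, u]$. Therefore by Eta $M_x[y] \longrightarrow \lambda u\,Q_{x,y}[y, u]$.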

2.4 Term families 

Since normalization by evaluation needs to create bound variables when "reifying"
abstract objects of higher type, it is useful to follow de Bruijn's [9] style of
representing bound variables in terms. This is done here (as in [4]) by means of
term families. A term family is a parametrized version of a given term $M$. The
idea is that the term family of $M$ at index $k$ reproduces $M$ with bound variables
renamed starting at $k$. For example, for

$$M := \lambda u\lambda v.\,c(\lambda x.\,vx)(\lambda y\lambda z.\,zu)$$

the associated term family $M^\infty$ at index 3 yields

$$M^\infty(3) := \lambda x_3\lambda x_4.\,c(\lambda x_5.\,x_4x_5)(\lambda x_5\lambda x_6.\,x_6x_3).$$

We denote terms by $M, N, K, \ldots$, and term families by $r, s, t, \ldots$.

To every term $M^\rho$ we assign a term family $M^\infty\colon \mathbb{N} \to \Lambda_\rho$ by

$$x^\infty(k) := x,$$

$$(\lambda y M)^\infty(k) := \lambda x_k\big(M_y[x_k]^\infty(k+1)\big), \qquad \langle M_0, M_1\rangle^\infty(k) := \langle M_0^\infty(k), M_1^\infty(k)\rangle,$$

$$(MN)^\infty(k) := M^\infty(k)N^\infty(k), \qquad \pi_i(M)^\infty(k) := \pi_i(M^\infty(k)).$$

Application of a term family $r\colon \mathbb{N} \to \Lambda_{\rho\to\sigma}$ to a term family $s\colon \mathbb{N} \to \Lambda_\rho$ is
the family $rs\colon \mathbb{N} \to \Lambda_\sigma$ defined by $(rs)(k) := r(k)s(k)$, and similarly for pairing
$\langle r_0, r_1\rangle(k) := \langle r_0(k), r_1(k)\rangle$ and projections $\pi_i(r)(k) := \pi_i(r(k))$. Hence e.g.
$(MN)^\infty = M^\infty N^\infty$.

Note that in contrast to our convention to consider terms up to bound renaming,
the definition of $(\lambda y M)^\infty$ refers to a particular choice of the bound
variable $y$. Part a of the following lemma shows that nevertheless this definition
is compatible with our convention. In the rest of this subsection we drop the
convention to identify $\alpha$-equal terms. Hence $M = N$ means that $M$ and $N$ are
literally identical as opposed to $M =_\alpha N$, which means equality up to bound
renaming.

We let $k > \mathrm{FV}(M)$ mean that $k$ is greater than all $i$ such that $x_i^\rho \in \mathrm{FV}(M)$
for some type $\rho$.

Lemma 2. a. If $M =_\alpha N$, then $M^\infty = N^\infty$.
b. If $k > \mathrm{FV}(M)$, then $M^\infty(k) =_\alpha M$.

Proof. a. Induction on the height of $M$. Only the case where $M$ and $N$ are
abstractions is critical. So assume $\lambda y^\rho M =_\alpha \lambda z^\rho N$. Then $M_y[P] =_\alpha N_z[P]$
for all terms $P^\rho$. In particular $M_y[x_k] =_\alpha N_z[x_k]$ for arbitrary $k \in \mathbb{N}$. Hence
$M_y[x_k]^\infty(k+1) = N_z[x_k]^\infty(k+1)$, by induction hypothesis. Therefore

$$(\lambda y M)^\infty(k) = \lambda x_k\big(M_y[x_k]^\infty(k+1)\big) = \lambda x_k\big(N_z[x_k]^\infty(k+1)\big) = (\lambda z N)^\infty(k).$$

b. Induction on the height of $M$. We only consider the case $\lambda y M$. The assumption
$k > \mathrm{FV}(\lambda y M)$ implies $x_k \notin \mathrm{FV}(\lambda y M)$ and hence $\lambda y M =_\alpha \lambda x_k(M_y[x_k])$.
Furthermore $k+1 > \mathrm{FV}(M_y[x_k])$, and hence $M_y[x_k]^\infty(k+1) =_\alpha M_y[x_k]$, by
induction hypothesis. Therefore

$$(\lambda y M)^\infty(k) = \lambda x_k\big(M_y[x_k]^\infty(k+1)\big) =_\alpha \lambda x_k(M_y[x_k]) =_\alpha \lambda y M. \qquad \Box$$

Let $\mathrm{ext}(r) := r(k)$, where $k$ is the least number greater than all $i$ such that
some variable of the form $x_i^\rho$ occurs (free or bound) in $r(0)$.

Lemma 3. $\mathrm{ext}(M^\infty) =_\alpha M$.

Proof. Use part b of the lemma above, and the fact that $\mathrm{ext}(M^\infty) = M^\infty(k)$
where $k > \mathrm{FV}(M)$.
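The definitions of $M^\infty$ and $\mathrm{ext}$ are easy to experiment with. The sketch below is an illustration only: it assumes an encoding of terms as Python tuples, variables named `x0`, `x1`, ..., and it replaces the substitution $M_y[x_k]$ by a renaming table that is carried along, which has the same effect. The helper names are hypothetical.

```python
import re

# Assumed term encoding: ("var", name), ("const", name),
# ("lam", name, body), ("app", fun, arg).

def family(term, k, renaming=None):
    """Compute the term family of `term` at index k."""
    renaming = renaming or {}
    tag = term[0]
    if tag == "var":
        return ("var", renaming.get(term[1], term[1]))
    if tag == "const":
        return term
    if tag == "lam":
        fresh = f"x{k}"
        body = family(term[2], k + 1, {**renaming, term[1]: fresh})
        return ("lam", fresh, body)
    return ("app", family(term[1], k, renaming), family(term[2], k, renaming))


def variable_indices(term):
    """All i such that a variable named x<i> occurs free or bound in `term`."""
    found = set()
    if term[0] in ("var", "lam"):
        hit = re.fullmatch(r"x(\d+)", term[1])
        if hit:
            found.add(int(hit.group(1)))
    if term[0] == "lam":
        found |= variable_indices(term[2])
    if term[0] == "app":
        found |= variable_indices(term[1]) | variable_indices(term[2])
    return found


def ext(r):
    """Evaluate the family r at the least index above all variables of r(0)."""
    used = variable_indices(r(0))
    return r(max(used) + 1 if used else 0)


def lam(name, body): return ("lam", name, body)
def app(fun, arg): return ("app", fun, arg)
def var(name): return ("var", name)

# M := \u \v. c (\x. v x) (\y \z. z u), the example from the text
M = lam("u", lam("v", app(app(("const", "c"),
                              lam("x", app(var("v"), var("x")))),
                          lam("y", lam("z", app(var("z"), var("u")))))))
at_three = family(M, 3)
extracted = ext(lambda k: family(M, k))
```

With this encoding `at_three` is the term $\lambda x_3\lambda x_4.\,c(\lambda x_5.\,x_4x_5)(\lambda x_5\lambda x_6.\,x_6x_3)$ displayed above, and `extracted` is the family evaluated at index 4, since $x_0, \ldots, x_3$ occur in the value at index 0.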
3 Domain theoretic semantics of simply typed $\lambda$-calculi

In this section we shall discuss the domain theoretic semantics of simply typed
lambda calculi in general. Although the constructions below are standard (see
e.g. the books of Lambek/Scott [10] or Crole [6]), we discuss them in some
detail in order to make the paper accessible also for readers not familiar with this
subject. Most constructions make sense in an arbitrary cartesian closed category
(ccc). However we will confine ourselves to the domain semantics and will only
occasionally comment on the categorical aspects.

It is well-known that Scott domains with continuous functions form a cartesian
closed category $\mathbf{Dom}$. The product $D \times E$ is the set-theoretic product with
the component-wise ordering. The exponential $[D \to E]$ is the continuous function
space with the pointwise ordering. The terminal object is the one point
space $1 := \{\bot\}$ (there is no initial object and there are no coproducts). In order
to cope with the categorical interpretation we will identify an element $x$ of a
domain $D$ with the mapping from $1$ to $D$ with value $x$.

Besides the cartesian closedness we also use the fact that $\mathbf{Dom}$ is closed under
infinite products and that there is a fixed point operator $\mathrm{Fix}\colon (D \to D) \to D$
assigning to every continuous function $f\colon D \to D$ its least fixed point $\mathrm{Fix}(f) \in D$.
Furthermore we will use that partial families of terms form a domain and
some basic operations on terms and term families are continuous and hence exist
as morphisms in the category. Any other ccc with these properties would do as
well.

Elements of a product domain $D_1 \times \cdots \times D_n$ are written $[a_1, \ldots, a_n]$. If
$f \in [D_1 \to [D_2 \to \ldots [D_n \to E]\ldots]]$ and $a_i \in D_i$, then $f(a_1, \ldots, a_n)$ or $f(\vec{a})$
stands for $f(a_1)\ldots(a_n)$.

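An interpretation for a given system of ground types is a mapping $\mathcal{I}$ assigning
to every ground type $\iota$ a domain $\mathcal{I}(\iota)$. Given such an interpretation we define
domains $[\![\rho]\!]^{\mathcal{I}}$ for every type $\rho$ by

$$[\![\iota]\!]^{\mathcal{I}} := \mathcal{I}(\iota), \qquad [\![\rho \to \sigma]\!]^{\mathcal{I}} := \big[[\![\rho]\!]^{\mathcal{I}} \to [\![\sigma]\!]^{\mathcal{I}}\big], \qquad [\![\rho \times \sigma]\!]^{\mathcal{I}} := [\![\rho]\!]^{\mathcal{I}} \times [\![\sigma]\!]^{\mathcal{I}}.$$

We write $[\![\rho_1, \ldots, \rho_n]\!]^{\mathcal{I}} := [\![\rho_1 \times \cdots \times \rho_n]\!]^{\mathcal{I}} = [\![\rho_1]\!]^{\mathcal{I}} \times \cdots \times [\![\rho_n]\!]^{\mathcal{I}} =: [\![\vec{\rho}]\!]^{\mathcal{I}}$. An
interpretation of a typed lambda calculus (specified by a set of ground types and a
set of constants) is a mapping $\mathcal{I}$ assigning to every ground type $\iota$ a domain $\mathcal{I}(\iota)$
(hence $\mathcal{I}$ is an interpretation of ground types), and assigning to every constant
$c^\rho$ a value $\mathcal{I}(c) \in [\![\rho]\!]^{\mathcal{I}}$ (i.e. a morphism from $1$ to $[\![\rho]\!]^{\mathcal{I}}$).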

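In order to extend such an interpretation to all terms we use the following
continuous functions, i.e. morphisms (in the sequel a continuous function will be
called morphism if its role as a morphism in the ccc $\mathbf{Dom}$ is to be emphasized):

$$!_D\colon D \to 1, \qquad !_D(d) := \bot,$$

$$\pi_i\colon D_1 \times \cdots \times D_n \to D_i, \qquad \pi_i([\vec{a}]) := a_i,$$

$$\mathrm{curry}\colon [D \times E \to F] \to [D \to [E \to F]], \qquad \mathrm{curry}(f, a, b) := f([a, b]),$$

$$\mathrm{eval}\colon [D \to E] \times D \to E, \qquad \mathrm{eval}(f, a) := f(a).$$

Furthermore we use the fact that morphisms are closed under composition $\circ$ and
(since $\mathbf{Dom}$ is a ccc) under pairing $\langle\cdot,\cdot\rangle$, where for $f\colon D \to E$ and $g\colon D \to F$ the
function $\langle f, g\rangle\colon D \to E \times F$ maps $a$ to $[f(a), g(a)]$. For every type $\rho$ and every
list of distinct variables $\vec{x} = x_1^{\rho_1}, \ldots, x_n^{\rho_n}$ we let $\Lambda_\rho(\vec{x})$ denote the set of terms
of type $\rho$ with free variables among $\{\vec{x}\}$. Let $\mathcal{I}$ be an interpretation. Then for
every $M \in \Lambda_\rho(\vec{x}^{\vec{\rho}})$ we define a morphism $[\![M]\!]^{\mathcal{I}}_{\vec{x}}\colon [\![\vec{\rho}]\!] \to [\![\rho]\!]$ by

$$[\![c]\!]^{\mathcal{I}}_{\vec{x}} := \mathcal{I}(c) \circ {!_{[\![\vec{\rho}]\!]}},$$

$$[\![x_i]\!]^{\mathcal{I}}_{\vec{x}} := \pi_i,$$

$$[\![\lambda x M]\!]^{\mathcal{I}}_{\vec{x}} := \mathrm{curry}([\![M]\!]^{\mathcal{I}}_{\vec{x},x}),$$

$$[\![MN]\!]^{\mathcal{I}}_{\vec{x}} := \mathrm{eval} \circ \langle [\![M]\!]^{\mathcal{I}}_{\vec{x}}, [\![N]\!]^{\mathcal{I}}_{\vec{x}}\rangle,$$

$$[\![\pi_i(M)]\!]^{\mathcal{I}}_{\vec{x}} := \pi_i \circ [\![M]\!]^{\mathcal{I}}_{\vec{x}}.$$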

This definition works in any ccc. For our purposes it will be more convenient to
evaluate a term in a global environment and not in a local context. Let

$$\mathrm{Env} := \prod_{x^\rho} [\![\rho]\!] \in \mathbf{Dom}.$$

For every term $M \in \Lambda_\rho(x_1, \ldots, x_n)$ we define a continuous function

$$[\![M]\!]^{\mathcal{I}}\colon \mathrm{Env} \to [\![\rho]\!], \qquad [\![M]\!]^{\mathcal{I}}_\xi := [\![M]\!]^{\mathcal{I}}_{\vec{x}}([\xi(x_1), \ldots, \xi(x_n)]).$$

Formally this definition depends on a particular choice of the list of variables
$x_1, \ldots, x_n$. However, because of the well-known coincidence property in fact it
does not.

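Lemma 4. If $M \in \Lambda_\rho(y_1, \ldots, y_m)$ then

$$[\![M]\!]^{\mathcal{I}}_\xi = [\![M]\!]^{\mathcal{I}}_{\vec{y}}([\xi(y_1), \ldots, \xi(y_m)]).$$

From this we easily get the familiar equations

$$[\![c]\!]_\xi = \mathcal{I}(c),$$

$$[\![x]\!]_\xi = \xi(x),$$

$$[\![\lambda x M]\!]_\xi(a) = [\![M]\!]_{\xi[x \mapsto a]},$$

$$[\![MN]\!]_\xi = [\![M]\!]_\xi([\![N]\!]_\xi),$$

$$[\![\pi_i(M)]\!]_\xi = \pi_i([\![M]\!]_\xi).$$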
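Read from left to right, these equations form a recursive evaluator. The following Python sketch illustrates this for total functions only (it ignores partiality and continuity). The tuple encoding of terms, the dictionary representation of environments and the name `evaluate` are assumptions of this example.

```python
# Assumed term encoding: ("var", name), ("const", name), ("lam", name, body),
# ("app", fun, arg), ("pair", left, right), ("proj", i, term).

def evaluate(term, env, consts):
    """Value of `term` under the environment `env` and the constant interpretation `consts`."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "const":
        return consts[term[1]]
    if tag == "lam":
        _, name, body = term
        # The value of an abstraction maps a to the value of the body in env[name -> a].
        return lambda a: evaluate(body, {**env, name: a}, consts)
    if tag == "app":
        return evaluate(term[1], env, consts)(evaluate(term[2], env, consts))
    if tag == "pair":
        return (evaluate(term[1], env, consts), evaluate(term[2], env, consts))
    if tag == "proj":
        return evaluate(term[2], env, consts)[term[1]]
    raise ValueError(f"unknown term: {term!r}")


consts = {"Suc": lambda n: n + 1}
twice = ("lam", "f", ("lam", "x",
         ("app", ("var", "f"), ("app", ("var", "f"), ("var", "x")))))
program = ("app", ("app", twice, ("const", "Suc")), ("var", "z"))
result = evaluate(program, {"z": 0}, consts)
```

Here `result` is 2: the abstraction case extends the environment exactly as in the equation for $\lambda x M$, and application is application of the host language.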

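In many cases the interpretation $\mathcal{I}$ of the constants will have to be defined
recursively, by e.g. referring to $[\![M]\!]^{\mathcal{I}}$ for several terms $M$. This causes no problem,
since the functionals $[\![M]\!]^{\mathcal{I}}\colon \mathrm{Env} \to [\![\rho]\!]$ depend continuously on $\mathcal{I}$, where $\mathcal{I}$ is
to be considered as an element of the infinite product $\prod_{c^\rho} [\![\rho]\!]$. This can be seen as
follows: Looking at their definitions we see that the functions $[\mathcal{I}, \vec{a}] \mapsto [\![M]\!]^{\mathcal{I}}_{\vec{x}}(\vec{a})$
are built by composition from the continuous functions

$$\pi_{c,\vec{x}}\colon \prod_{c^\rho} [\![\rho]\!] \times [\![\vec{\rho}]\!] \to [\![\sigma]\!], \qquad \pi_{c,\vec{x}}(\mathcal{I}, \vec{a}) := \mathcal{I}(c) \quad (\text{for } c \text{ of type } \sigma),$$

$$\circ\colon [E \to F] \times [D \to E] \to [D \to F],$$

$$\langle\cdot,\cdot\rangle\colon [D \to E] \times [D \to F] \to [D \to E \times F],$$

as well as the functions $\pi_i$, $\mathrm{curry}$ and $\mathrm{eval}$ above. Hence $[\mathcal{I}, \vec{a}] \mapsto [\![M]\!]^{\mathcal{I}}_{\vec{x}}(\vec{a})$ is
continuous. But then also $[\mathcal{I}, \xi] \mapsto [\![M]\!]^{\mathcal{I}}_\xi$ is continuous, since we have
$[\![M]\!]^{\mathcal{I}}_\xi = [\![M]\!]^{\mathcal{I}}_{\vec{x}}(\pi_{x_1}(\xi), \ldots, \pi_{x_n}(\xi))$, where $\pi_{x^\rho}\colon \mathrm{Env} \to [\![\rho]\!]$, $\pi_{x^\rho}(\xi) := \xi(x)$.

Hence the value $\mathcal{I}(c)$ may be defined as a least fixed point of a continuous
function on the domain $\prod_{c^\rho} [\![\rho]\!]$. In the sequel we will omit the superscript $\mathcal{I}$
when it is clear from the context.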

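The following facts hold in any ccc.

Lemma 5.

$$[\![M_{\vec{x}}[\vec{N}]]\!]_\xi = [\![M]\!]_{\xi[\vec{x} \mapsto [\![\vec{N}]\!]_\xi]} \qquad \text{(substitution lemma)}$$

$$[\![(\lambda x M)N]\!]_\xi = [\![M_x[N]]\!]_\xi \qquad \text{(beta 1)}$$

$$[\![\pi_i(\langle M_0, M_1\rangle)]\!]_\xi = [\![M_i]\!]_\xi \qquad \text{(beta 2)}$$

$$[\![M]\!]_\xi = [\![\lambda y(My)]\!]_\xi \quad (y \notin \mathrm{FV}(M)) \qquad \text{(eta 1)}$$

$$[\![M]\!]_\xi = [\![\langle\pi_0(M), \pi_1(M)\rangle]\!]_\xi \qquad \text{(eta 2)}$$

Lemma 6. If $[\![P]\!]_\xi = [\![Q]\!]_\xi$ for all environments $\xi$, and $M$ is transformed into
$N$ by replacing an occurrence of $P$ in $M$ by $Q$, then $[\![M]\!]_\xi = [\![N]\!]_\xi$ for all
environments $\xi$.

Proof. Induction on $M$.

Lemma 7. If $M \to_\beta N$ or $M \to_\eta N$, then $[\![M]\!]_\xi = [\![N]\!]_\xi$.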

4 Normalization by evaluation 

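We now consider a special model, whose ground type objects are made from
syntactic material. We let $\mathbb{N} \rightharpoonup \Lambda_\rho$ denote the set of partial term families, i.e.
partial functions from the integers to the set of terms of type $\rho$. $\mathbb{N} \rightharpoonup \Lambda_\rho$ partially
ordered by inclusion of graphs is a domain. We interpret the ground types by

$$\mathcal{I}(\iota) := (\mathbb{N} \rightharpoonup \Lambda_\iota).$$

This gives us an interpretation $[\![\rho]\!] = [\![\rho]\!]^{\mathcal{I}} \in \mathbf{Dom}$ for every type $\rho$. In order to
define the interpretation of the constants, some preparations are necessary.

4.1 Reification and reflection; interpretation of the constants

For every type $\rho$ we define two continuous functions

$$\downarrow_\rho\colon [\![\rho]\!] \to (\mathbb{N} \rightharpoonup \Lambda_\rho) \ \ (\text{"reify"}), \qquad \uparrow_\rho\colon (\mathbb{N} \rightharpoonup \Lambda_\rho) \to [\![\rho]\!] \ \ (\text{"reflect"}).$$

These functions will be used when defining the interpretation of the constants
as well as in our final goal, normalization by evaluation. $\downarrow_\rho$ and $\uparrow_\rho$ are defined
simultaneously by induction on the type $\rho$.

$$\downarrow_\iota(r) := r, \qquad \uparrow_\iota(r) := r,$$

$$\downarrow_{\rho\to\sigma}(a)(k) := \lambda x_k\big(\downarrow_\sigma(a(\uparrow_\rho(x_k^\infty)))(k+1)\big), \qquad \uparrow_{\rho\to\sigma}(r)(b) := \uparrow_\sigma(r\downarrow_\rho(b)),$$

$$\downarrow_{\rho\times\sigma}([a, b]) := \langle\downarrow_\rho(a), \downarrow_\sigma(b)\rangle, \qquad \uparrow_{\rho\times\sigma}(r) := [\uparrow_\rho(\pi_0(r)), \uparrow_\sigma(\pi_1(r))].$$

Note that for $a_i \in [\![\rho_i]\!]$ we have $\uparrow_{\vec{\rho}\to\sigma}(r)(a_1, \ldots, a_n) = \uparrow_\sigma(r\downarrow_{\rho_1}(a_1) \ldots \downarrow_{\rho_n}(a_n))$, to
which we refer by

$$\uparrow(r)(\vec{a}) = \uparrow(r\downarrow(\vec{a})). \qquad (1)$$

We now define the values of the constants $c$ in our special model. Note that
we can view $\uparrow$ as an environment: it assigns to every variable $x$ of type $\rho$ the
value $\uparrow(x^\infty) \in [\![\rho]\!]$.

$$\mathcal{I}(c)(\vec{a}) := \begin{cases} [\![N]\!]_{\uparrow[\vec{x} \mapsto [\![\vec{L}]\!]_\uparrow]} & \text{if } \mathrm{sel}_c(\mathrm{ext}(\downarrow(\vec{a}))) = c\vec{K} \mapsto N \text{ and } \mathrm{ext}(\downarrow(\vec{a})) = \vec{K}_{\vec{x}}[\vec{L}], \\ \uparrow(c^\infty\downarrow(\vec{a})) & \text{if } \mathrm{sel}_c(\mathrm{ext}(\downarrow(\vec{a}))) = \text{no-match}, \\ \bot & \text{otherwise, i.e. } \mathrm{ext}(\downarrow(\vec{a})) \text{ is undefined.} \end{cases}$$

Note that this in general is a recursive definition, since the terms $N$ and $\vec{L}$
may contain $c$. We now obtain the correctness of normalization by evaluation: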

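Theorem 1. If $M \longrightarrow Q$, then $\downarrow([\![M]\!]_\uparrow) = Q^\infty$.

Proof. By induction on the height of the derivation of $M \longrightarrow Q$. For brevity
we leave out the rules concerning product types, since their treatment does not
bring up any new issues.

Case Eta, i.e.

$$\frac{My \longrightarrow Q}{M \longrightarrow \lambda y\,Q}\ \ \text{for } y \notin \mathrm{FV}(M)$$

with $M$ of type $\rho \to \sigma$. By lemma 1 we then have $Mx_k \longrightarrow Q_y[x_k]$ with a
derivation of the same height as that of the given derivation of $My \longrightarrow Q$, hence
by induction hypothesis $\downarrow([\![Mx_k]\!]_\uparrow) = Q_y[x_k]^\infty$. We obtain

$$\begin{aligned}
\downarrow([\![M]\!]_\uparrow)(k) &= \lambda x_k\big(\downarrow([\![M]\!]_\uparrow(\uparrow(x_k^\infty)))(k+1)\big) \\
&= \lambda x_k\big(\downarrow([\![Mx_k]\!]_\uparrow)(k+1)\big) \\
&= \lambda x_k\big(Q_y[x_k]^\infty(k+1)\big) && \text{by IH} \\
&= (\lambda y\,Q)^\infty(k).
\end{aligned}$$

In the following cases the rules are of ground type $\iota$, so we use that
$\downarrow([\![M]\!]_\uparrow) = [\![M]\!]_\uparrow$ for $M \in \Lambda_\iota$.

Case Beta, i.e.

$$\frac{M_x[N]\vec{P} \longrightarrow Q}{(\lambda x M)N\vec{P} \longrightarrow Q}$$

$$\begin{aligned}
[\![(\lambda x M)N\vec{P}]\!]_\uparrow &= [\![M_x[N]\vec{P}]\!]_\uparrow && \text{by lemma 7} \\
&= Q^\infty && \text{by IH.}
\end{aligned}$$

Case Red, i.e.

$$\frac{\vec{M} \longrightarrow \vec{K}_{\vec{x}}[\vec{L}] \quad N_{\vec{x}}[\vec{L}]\vec{P} \longrightarrow Q}{c\vec{M}\vec{P} \longrightarrow Q}$$

where $c\vec{K} \mapsto N$ is the selected rule, i.e. $\mathrm{sel}_c(\vec{K}_{\vec{x}}[\vec{L}]) = c\vec{K} \mapsto N$. Recall

$$[\![c\vec{M}\vec{P}]\!]_\uparrow = \mathcal{I}(c)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow).$$

By induction hypothesis $\downarrow([\![\vec{M}]\!]_\uparrow) = \vec{K}_{\vec{x}}[\vec{L}]^\infty$. By definition of $\mathcal{I}(c)$ we have
to compute $\mathrm{sel}_c(\mathrm{ext}(\downarrow([\![\vec{M}]\!]_\uparrow)))$. By lemma 3 $\mathrm{ext}(\vec{K}_{\vec{x}}[\vec{L}]^\infty) = \vec{K}_{\vec{x}}[\vec{L}]$. Hence
$\mathrm{sel}_c(\mathrm{ext}(\downarrow([\![\vec{M}]\!]_\uparrow))) = \mathrm{sel}_c(\vec{K}_{\vec{x}}[\vec{L}]) = c\vec{K} \mapsto N$, and therefore

$$\begin{aligned}
[\![c\vec{M}\vec{P}]\!]_\uparrow &= \mathcal{I}(c)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow) \\
&= [\![N]\!]_{\uparrow[\vec{x} \mapsto [\![\vec{L}]\!]_\uparrow]}([\![\vec{P}]\!]_\uparrow) && \text{by definition of } \mathcal{I}(c) \\
&= [\![N_{\vec{x}}[\vec{L}]]\!]_\uparrow([\![\vec{P}]\!]_\uparrow) && \text{by the substitution lemma} \\
&= [\![N_{\vec{x}}[\vec{L}]\vec{P}]\!]_\uparrow \\
&= Q^\infty && \text{by IH.}
\end{aligned}$$

Case VarApp, i.e.

$$\frac{\vec{M} \longrightarrow \vec{M}'}{x\vec{M} \longrightarrow x\vec{M}'}$$

$$\begin{aligned}
[\![x\vec{M}]\!]_\uparrow &= \uparrow(x^\infty)([\![\vec{M}]\!]_\uparrow) \\
&= x^\infty\downarrow([\![\vec{M}]\!]_\uparrow) && \text{by (1)} \\
&= x^\infty(\vec{M}')^\infty && \text{by IH} \\
&= (x\vec{M}')^\infty.
\end{aligned}$$

Case PassApp, i.e.

$$\frac{\vec{M} \longrightarrow \vec{M}' \quad \vec{P} \longrightarrow \vec{P}'}{c\vec{M}\vec{P} \longrightarrow c\vec{M}'\vec{P}'}$$

where there is no rule $c\vec{K} \mapsto N$ such that $\vec{M}'$ is an instance of $\vec{K}$, i.e.
$\mathrm{sel}_c(\vec{M}') = \text{no-match}$. We obtain by induction hypothesis $\downarrow([\![\vec{M}]\!]_\uparrow) = \vec{M}'^\infty$ and
hence $\mathrm{sel}_c(\mathrm{ext}(\downarrow([\![\vec{M}]\!]_\uparrow))) = \mathrm{sel}_c(\mathrm{ext}(\vec{M}'^\infty)) = \mathrm{sel}_c(\vec{M}') = \text{no-match}$. Hence

$$\begin{aligned}
[\![c\vec{M}\vec{P}]\!]_\uparrow &= \mathcal{I}(c)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow) \\
&= \uparrow(c^\infty\downarrow([\![\vec{M}]\!]_\uparrow))([\![\vec{P}]\!]_\uparrow) && \text{by definition of } \mathcal{I}(c) \\
&= c^\infty\downarrow([\![\vec{M}, \vec{P}]\!]_\uparrow) && \text{by (1)} \\
&= c^\infty(\vec{M}')^\infty(\vec{P}')^\infty && \text{by IH} \\
&= (c\vec{M}'\vec{P}')^\infty. && \Box
\end{aligned}$$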



4.2 Special rewrite rules 

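We will now consider special cases of our general form of rewrite rules, to
be called d-, e- and f-rules, where the interpretation function $\mathcal{I}$ and hence
normalization by evaluation is more efficient. It will be shown that theorem 1
continues to hold.

d-rules. One problem with the interpretation of the constants in 4.1 is a certain
inefficiency inherent in it: after computing a syntactic reification of $\vec{a}$ by
$\mathrm{ext}(\downarrow(\vec{a}))$, determining the relevant rule $c\vec{K} \mapsto N$ as $\mathrm{sel}_c(\mathrm{ext}(\downarrow(\vec{a})))$ and reading
off $\vec{L}$ such that $\mathrm{ext}(\downarrow(\vec{a})) = \vec{K}_{\vec{x}}[\vec{L}]$, we must go back to semantics and compute
$[\![\vec{L}]\!]_\uparrow$.

However, in many cases we have rules of the special form

$$d\vec{N}_{\vec{z}}[\vec{K}] \mapsto P_{\vec{z}}[\vec{K}], \qquad (2)$$

with $\vec{K}$ of ground types and $\mathrm{FV}(P) \subseteq \vec{z}$. Notice that $\vec{N}_{\vec{z}}[\vec{K}]_{\vec{x}}[\vec{L}] = \vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]$
and $P_{\vec{z}}[\vec{K}]_{\vec{x}}[\vec{L}] = P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]$. So for instance $R\,0 \mapsto \lambda y, z.\,y$ and
$R(\mathrm{Suc}\,x) \mapsto \lambda y, z.\,z\,x\,(R\,x\,y\,z)$ are instances of this kind of rules. Then it is tempting to define

$$\mathcal{I}(d)(\vec{a}) := \begin{cases} [\![P]\!]_{[\vec{z} \mapsto \vec{K}_{\vec{x}}[\vec{L}]^\infty]} & \text{if } \mathrm{sel}_d(\mathrm{ext}(\downarrow(\vec{a}))) = d\vec{N}_{\vec{z}}[\vec{K}] \mapsto P_{\vec{z}}[\vec{K}] \text{ and } \mathrm{ext}(\downarrow(\vec{a})) = \vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]], \\ \uparrow(d^\infty\downarrow(\vec{a})) & \text{if } \mathrm{sel}_d(\mathrm{ext}(\downarrow(\vec{a}))) = \text{no-match}, \\ \bot & \text{otherwise.} \end{cases}$$

Here the necessity to evaluate $\vec{L}$ no longer exists; rather, we can work with the
term family $\vec{K}_{\vec{x}}[\vec{L}]^\infty$ directly.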



e-rules. One may still not be satisfied with the d-rules, since each time $\mathcal{I}(d)(\vec{a})$
is evaluated we need to compute $\mathrm{sel}_d(\mathrm{ext}(\downarrow(\vec{a})))$ in order to determine the rule to
be applied. Now for $a_i$ of ground type each $r_i := \downarrow(a_i)$ is a term family, and the
computation of $\mathrm{ext}(r_i) = r_i(k)$ involves computing $k$ from the set of all free and
bound variables of the term $r_i(0)$. However, in many cases it suffices to evaluate
the term family $r_i$ at a fixed index, say 0.

Consider e-rules of the form

$$e\vec{K} \mapsto P_{\vec{z}}[\vec{K}] \qquad (3)$$

with $\vec{K}$ of ground types and $\mathrm{FV}(P) \subseteq \vec{z}$, and assume that for normal terms $\vec{M}$

$$\mathrm{sel}_e(\vec{M}^\infty(0)) = \mathrm{sel}_e(\vec{M}). \qquad (4)$$

This is particularly likely to hold if the terms $\vec{K}$ are $\lambda$-free, since the formation
of term families $\vec{K}_{\vec{x}}[\vec{L}]^\infty$ only affects names of $\lambda$-bound variables. Now define

$$\mathcal{I}(e)(\vec{r}) := \begin{cases} [\![P]\!]_{[\vec{z} \mapsto \vec{r}]} & \text{if } \mathrm{sel}_e(\vec{r}(0)) = e\vec{K} \mapsto P_{\vec{z}}[\vec{K}], \\ \uparrow(e^\infty\vec{r}) & \text{if } \mathrm{sel}_e(\vec{r}(0)) = \text{no-match}, \\ \bot & \text{otherwise.} \end{cases}$$

Note that if we would interpret $e$ according to the d-rules, we would have
$[\![P]\!]_{[\vec{z} \mapsto \mathrm{ext}(\vec{r})^\infty]}$ in the first case. But $\mathrm{ext}(\vec{r})^\infty$ and $\vec{r}$ will be different in general.

Rules of the form (3) satisfying (4) are particularly possible if we work within
the ground type $\iota$ of natural numbers and have the predecessor function available,
as a fixed constant with a fixed interpretation.



f-rules. There is a final type of rules, to be called f-rules, which we want
to consider. Their usefulness comes up when we work with concrete data types
like the type $\iota$ of natural numbers and want to employ e.g. the usual polynomial
normal form of terms. For instance, the term $(n+3)(n^2+2n+5)$ should normalize
to $n^3+5n^2+11n+15$. Abstractly, what we have here is a function $\mathrm{norm}\colon \Lambda_\iota \to \Lambda_\iota$
for a ground type $\iota$, and it will turn out that all we need to prove theorem 1 are
the following properties:

$$\mathrm{norm}^2 = \mathrm{norm} \qquad (5)$$

$$\mathrm{norm}(M^\infty(k)) = \mathrm{norm}(M)^\infty(k) \qquad (6)$$

$$\text{if } M \text{ is cde-normal, then } \mathrm{norm}(M) \text{ is cde-normal} \qquad (7)$$

where cde-normal refers to the c-rules, the d-rules and the e-rules (and of course
the $\beta$-rule). Given such a function $\mathrm{norm}$, we add all rules of the form

$$f\vec{M} \mapsto \mathrm{norm}(f\vec{M}) \quad \text{provided both are different,} \qquad (8)$$

with $\vec{M}$ and $f\vec{M}$ of ground types; these should be the only rules for $f$. We define
the $\mathrm{sel}_f$ function by

$$\mathrm{sel}_f(\vec{M}) := \begin{cases} f\vec{M} \mapsto \mathrm{norm}(f\vec{M}) & \text{if } f\vec{M} \text{ and } \mathrm{norm}(f\vec{M}) \text{ are different,} \\ \text{no-match} & \text{otherwise.} \end{cases}$$

The interpretation is defined by

$$\mathcal{I}(f)(\vec{r})(k) := \mathrm{norm}(f\vec{r}(k)).$$
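To make the role of $\mathrm{norm}$ concrete, here is a small Python sketch of a polynomial normalizer in one variable $n$. It is an illustration under several assumptions: arithmetic terms are encoded as tuples, the canonical output uses Horner form rather than the sum-of-monomials form shown in the text, and the names `poly`, `to_term` and `norm` are hypothetical. Only property (5) is checked; properties (6) and (7) concern bound variables and the other rule classes, which this fragment does not model.

```python
# Assumed term encoding: ("num", c), ("var", "n"), ("add", a, b), ("mul", a, b).

def trim(coeffs):
    """Drop trailing zero coefficients so that equal polynomials get equal lists."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly(term):
    """Coefficient list of `term`, lowest degree first."""
    tag = term[0]
    if tag == "num":
        return trim([term[1]])
    if tag == "var":
        return [0, 1]
    a, b = poly(term[1]), poly(term[2])
    if tag == "add":
        size = max(len(a), len(b))
        a = a + [0] * (size - len(a))
        b = b + [0] * (size - len(b))
        return trim(x + y for x, y in zip(a, b))
    if tag == "mul":
        if not a or not b:
            return []
        out = [0] * (len(a) + len(b) - 1)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                out[i + j] += x * y
        return trim(out)
    raise ValueError(f"unknown term: {term!r}")


def to_term(coeffs):
    """Canonical term for a trimmed coefficient list, in Horner form."""
    if not coeffs:
        return ("num", 0)
    if len(coeffs) == 1:
        return ("num", coeffs[0])
    return ("add", ("num", coeffs[0]), ("mul", ("var", "n"), to_term(coeffs[1:])))


def norm(term):
    return to_term(poly(term))


n = ("var", "n")
# (n + 3)(n^2 + 2n + 5)
example = ("mul", ("add", n, ("num", 3)),
           ("add", ("add", ("mul", n, n), ("mul", ("num", 2), n)), ("num", 5)))
coefficients = poly(example)
```

For the example from the text, `coefficients` is `[15, 11, 5, 1]`, i.e. $15 + 11n + 5n^2 + n^3$, and `norm(norm(example)) == norm(example)` holds because `poly` inverts `to_term` on trimmed coefficient lists.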
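Extension of theorem 1. We will show that theorem 1 (i.e. $M \longrightarrow Q$ implies
$\downarrow([\![M]\!]_\uparrow) = Q^\infty$) continues to hold for such constants $d$, $e$ and $f$ with
interpretations as given above.

Lemma 8. For normal terms $M$ we have

$$\downarrow([\![M]\!]_\uparrow) = M^\infty.$$

Proof. Induction on the height of $M$.

Case $\lambda y^\rho M^\sigma$. This is similar to, but a little simpler than the case Eta in
theorem 1.

$$\begin{aligned}
\downarrow([\![\lambda y M]\!]_\uparrow)(k) &= \lambda x_k\big(\downarrow([\![\lambda y M]\!]_\uparrow(\uparrow(x_k^\infty)))(k+1)\big) \\
&= \lambda x_k\big(\downarrow([\![M_y[x_k]]\!]_\uparrow)(k+1)\big) \\
&= \lambda x_k\big(M_y[x_k]^\infty(k+1)\big) && \text{by IH} \\
&= (\lambda y M)^\infty(k).
\end{aligned}$$

Case $\langle M, N\rangle$. Easy.

Case $(x\vec{M})^\iota$.

$$\begin{aligned}
[\![x\vec{M}]\!]_\uparrow &= \uparrow(x^\infty)([\![\vec{M}]\!]_\uparrow) \\
&= x^\infty\downarrow([\![\vec{M}]\!]_\uparrow) && \text{by (1)} \\
&= x^\infty\vec{M}^\infty && \text{by IH} \\
&= (x\vec{M})^\infty.
\end{aligned}$$

Case $(c\vec{M}\vec{P})^\iota$. Since $c\vec{M}\vec{P}$ is normal, we have $\mathrm{sel}_c(\vec{M}) = \text{no-match}$ by
our general requirements on $\mathrm{sel}_c$, and by induction hypothesis and lemma 3
$\mathrm{sel}_c(\mathrm{ext}(\downarrow([\![\vec{M}]\!]_\uparrow))) = \mathrm{sel}_c(\mathrm{ext}(\vec{M}^\infty)) = \mathrm{sel}_c(\vec{M}) = \text{no-match}$, hence

$$\begin{aligned}
[\![c\vec{M}\vec{P}]\!]_\uparrow &= \mathcal{I}(c)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow) \\
&= \uparrow(c^\infty\downarrow([\![\vec{M}]\!]_\uparrow))([\![\vec{P}]\!]_\uparrow) \\
&= c^\infty\downarrow([\![\vec{M}]\!]_\uparrow)\downarrow([\![\vec{P}]\!]_\uparrow) && \text{by (1)} \\
&= c^\infty\vec{M}^\infty\vec{P}^\infty && \text{by IH} \\
&= (c\vec{M}\vec{P})^\infty.
\end{aligned}$$

Case $(d\vec{M}\vec{P})^\iota$. We have only used that the interpretation of the constants $c$
satisfies $\mathcal{I}(c)(\vec{a}) = \uparrow(c^\infty\downarrow(\vec{a}))$ if $\mathrm{sel}_c(\mathrm{ext}(\downarrow(\vec{a}))) = \text{no-match}$, and this also holds
for the constants $d$.

Case $(e\vec{M}\vec{P})^\iota$. Since $e\vec{M}\vec{P}$ is normal, we again have $\mathrm{sel}_e(\vec{M}) = \text{no-match}$.
By induction hypothesis and (4) $\mathrm{sel}_e([\![\vec{M}]\!]_\uparrow(0)) = \mathrm{sel}_e(\vec{M}^\infty(0)) = \mathrm{sel}_e(\vec{M}) = \text{no-match}$.
The proof then proceeds as before.

Case $(f\vec{M})^\iota$. Since $f\vec{M}$ is normal, we have $f\vec{M} = \mathrm{norm}(f\vec{M})$, hence

$$\begin{aligned}
[\![f\vec{M}]\!]_\uparrow(k) &= \mathrm{norm}(f[\![\vec{M}]\!]_\uparrow(k)) \\
&= \mathrm{norm}(f\vec{M}^\infty(k)) && \text{by IH} \\
&= \mathrm{norm}((f\vec{M})^\infty(k)) \\
&= \mathrm{norm}(f\vec{M})^\infty(k) && \text{by (6)} \\
&= (f\vec{M})^\infty(k).
\end{aligned}$$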

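Theorem 2. If $M \longrightarrow Q$ w.r.t. all types of rules above, then $\downarrow([\![M]\!]_\uparrow) = Q^\infty$.

Proof. As for theorem 1; we only have to add an argument for the new rules in
case Red.

Case $d\vec{N}_{\vec{z}}[\vec{K}] \mapsto P_{\vec{z}}[\vec{K}]$ for Red, i.e.

$$\frac{\vec{M} \longrightarrow \vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]] \quad P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]\vec{P} \longrightarrow Q}{d\vec{M}\vec{P} \longrightarrow Q}$$

where $\mathrm{sel}_d(\vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]) = d\vec{N}_{\vec{z}}[\vec{K}] \mapsto P_{\vec{z}}[\vec{K}]$, i.e. $d\vec{N}_{\vec{z}}[\vec{K}] \mapsto P_{\vec{z}}[\vec{K}]$ is the
selected rule. Recall

$$[\![d\vec{M}\vec{P}]\!]_\uparrow = \mathcal{I}(d)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow).$$

By induction hypothesis we have $\downarrow([\![\vec{M}]\!]_\uparrow) = \vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]^\infty$. Now by lemma 3
$\mathrm{ext}(\vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]^\infty) = \vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]$, so

$$\mathrm{sel}_d(\mathrm{ext}(\downarrow([\![\vec{M}]\!]_\uparrow))) = \mathrm{sel}_d(\vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]) = d\vec{N}_{\vec{z}}[\vec{K}] \mapsto P_{\vec{z}}[\vec{K}].$$

Hence

$$\begin{aligned}
[\![d\vec{M}\vec{P}]\!]_\uparrow &= \mathcal{I}(d)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow) \\
&= [\![P]\!]_{[\vec{z} \mapsto \vec{K}_{\vec{x}}[\vec{L}]^\infty]}([\![\vec{P}]\!]_\uparrow) && \text{by definition of } \mathcal{I}(d) \\
&= [\![P]\!]_{[\vec{z} \mapsto [\![\vec{K}_{\vec{x}}[\vec{L}]]\!]_\uparrow]}([\![\vec{P}]\!]_\uparrow) && \text{see below} \\
&= [\![P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]]\!]_\uparrow([\![\vec{P}]\!]_\uparrow) && \text{by the substitution lemma} \\
&= [\![P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]\vec{P}]\!]_\uparrow \\
&= Q^\infty && \text{by IH.}
\end{aligned}$$

It remains to show that $\vec{K}_{\vec{x}}[\vec{L}]^\infty = [\![\vec{K}_{\vec{x}}[\vec{L}]]\!]_\uparrow$. Now $\vec{M} \longrightarrow \vec{N}_{\vec{z}}[\vec{K}]_{\vec{x}}[\vec{L}]$, hence
$\vec{N}_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]] \in \mathrm{Lnf}$, and so all subterms of ground type, in particular $\vec{K}_{\vec{x}}[\vec{L}]$, are
also in $\mathrm{Lnf}$. Hence $[\![\vec{K}_{\vec{x}}[\vec{L}]]\!]_\uparrow = \downarrow([\![\vec{K}_{\vec{x}}[\vec{L}]]\!]_\uparrow) = \vec{K}_{\vec{x}}[\vec{L}]^\infty$ by lemma 8.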

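Case $e\vec{K} \mapsto P_{\vec{z}}[\vec{K}]$ for Red, i.e.

$$\frac{\vec{M} \longrightarrow \vec{K}_{\vec{x}}[\vec{L}] \quad P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]\vec{P} \longrightarrow Q}{e\vec{M}\vec{P} \longrightarrow Q}$$

where $e\vec{K} \mapsto P_{\vec{z}}[\vec{K}]$ is the selected rule, i.e. $\mathrm{sel}_e(\vec{K}_{\vec{x}}[\vec{L}]) = e\vec{K} \mapsto P_{\vec{z}}[\vec{K}]$.
By induction hypothesis we have $[\![\vec{M}]\!]_\uparrow = \vec{K}_{\vec{x}}[\vec{L}]^\infty$, hence
$\mathrm{sel}_e([\![\vec{M}]\!]_\uparrow(0)) = \mathrm{sel}_e(\vec{K}_{\vec{x}}[\vec{L}]^\infty(0)) = \mathrm{sel}_e(\vec{K}_{\vec{x}}[\vec{L}]) = e\vec{K} \mapsto P_{\vec{z}}[\vec{K}]$ by (4). Therefore

$$\begin{aligned}
[\![e\vec{M}\vec{P}]\!]_\uparrow &= \mathcal{I}(e)([\![\vec{M}]\!]_\uparrow)([\![\vec{P}]\!]_\uparrow) \\
&= [\![P]\!]_{[\vec{z} \mapsto [\![\vec{M}]\!]_\uparrow]}([\![\vec{P}]\!]_\uparrow) && \text{by definition of } \mathcal{I}(e) \\
&= [\![P]\!]_{[\vec{z} \mapsto \vec{K}_{\vec{x}}[\vec{L}]^\infty]}([\![\vec{P}]\!]_\uparrow) && \text{by IH} \\
&= [\![P]\!]_{[\vec{z} \mapsto [\![\vec{K}_{\vec{x}}[\vec{L}]]\!]_\uparrow]}([\![\vec{P}]\!]_\uparrow) && \text{since } [\![\vec{K}_{\vec{x}}[\vec{L}]]\!]_\uparrow = \vec{K}_{\vec{x}}[\vec{L}]^\infty \text{ by lemma 8} \\
&= [\![P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]]\!]_\uparrow([\![\vec{P}]\!]_\uparrow) && \text{by the substitution lemma} \\
&= [\![P_{\vec{z}}[\vec{K}_{\vec{x}}[\vec{L}]]\vec{P}]\!]_\uparrow \\
&= Q^\infty && \text{by IH.}
\end{aligned}$$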

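Case $f\vec{K} \mapsto \mathrm{norm}(f\vec{K})$ for Red, i.e.

$$\frac{\vec{M} \longrightarrow \vec{K} \quad \mathrm{norm}(f\vec{K}) \longrightarrow Q}{f\vec{M} \longrightarrow Q}$$

Note that $f\vec{K}$ is cde-normal, hence by (7) $\mathrm{norm}(f\vec{K})$ is cde-normal as well.
Moreover because of (5) we have $\mathrm{norm}^2(f\vec{K}) = \mathrm{norm}(f\vec{K})$, so $\mathrm{norm}(f\vec{K})$ is also
normal w.r.t. the f-rules. Now by induction on the rules for $\longrightarrow$ one can see at
once that for normal $M$, $M \longrightarrow Q$ implies $M = Q$. Therefore, in our instance of
the rule Red above we actually have $\mathrm{norm}(f\vec{K}) = Q$. Now we obtain

$$\begin{aligned}
[\![f\vec{M}]\!]_\uparrow(k) &= \mathrm{norm}(f[\![\vec{M}]\!]_\uparrow(k)) && \text{by definition of } \mathcal{I}(f) \\
&= \mathrm{norm}(f\vec{K}^\infty(k)) && \text{by IH} \\
&= \mathrm{norm}((f\vec{K})^\infty(k)) \\
&= \mathrm{norm}(f\vec{K})^\infty(k) && \text{by (6)} \\
&= Q^\infty(k).
\end{aligned}$$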

5 Comparison 

The difference of the present approach to those of other authors is mainly that
$\to$ is modelled simply by the function space. This means that we only use the
properties of application and abstraction given in the definition of a cartesian
closed category. Due to this fact we can use an internal evaluation of a programming
language like Scheme. In other approaches [1, 5, 7] evaluation has to be
defined by hand. A comparison between an internal and a hand made evaluation
shows that the former is much more efficient. We tested this in Scheme and
present a table with the run-times for some appropriate examples below.



5.1 Related work 

In [1] Altenkirch, Hofmann and Streicher give a categorical explanation of
normalization by evaluation. Therefore they describe terms as morphisms from
a context to their types. The model is not an arbitrary cartesian closed category
but a presheaf over a category $\mathcal{W}$ of weakenings with contexts as objects.
Roughly speaking weakenings are projections on finite sequences of variables
and a presheaf can be seen as a variable set or a proof relevant Kripke struc- 
ture. In the interpretation of objects depend on as well as on the 

morphisms of W (cf. the implication in a Kripke structure). In their proof they 
make use of a category that has both, a semantical and a syntactical component, 
called glueing model. They also describe a correspondence between intuitionistic 
completeness proofs and normalization. 

Čubrić, Dybjer and Scott [7] have a similar aim, namely to give a normalization
proof with as few syntactic properties as possible. They also describe
terms as morphisms and use a presheaf model over these terms, but here categories
are equipped with a partial equivalence relation. So the function $\downarrow$ (called
quote in [1, 5, 7]) appears as natural isomorphism between the Yoneda embedding
and an interpretation that interprets atoms by Yoneda. The function
quote and its inverse are used when the universal property of the category of
terms is shown. In the appendix they investigate some $\lambda$-theories, but it seems
that these are too restricted for practical purposes.

Coquand and Dybjer [5] also interpret terms in a model and reconstruct
the normal forms via the function quote. The $\lambda$-calculus is given by S- and
K-combinators, so they get slightly different normal forms. The model is a glueing
model, therefore it is not extensional and application is application of the
semantic part. They emphasize that they use an intuitionistic metalanguage and
have implemented the algorithm in Alf.

Danvy [8] successfully uses the $\downarrow$ and $\uparrow$ functions for partial evaluation, in
particular for compiler construction. His setting is the two-level $\lambda$-calculus.



5.2 Comparison of algorithms 

We tested different ways to normalize simply typed $\lambda$-terms. The first is
normalization by evaluation using an internal eval function of the given programming
language. The second is normalization by evaluation with a user defined eval
function. The third is normalization by means of a user defined $\beta$-conversion. In
situations where application is not the usual one, e.g. in presheaf models or in
glueing models, and for typed programming languages like Standard ML, whose
eval is not accessible to the user, the first way is excluded.

To test the different normalization algorithms we used iterated functions, i.e. 
$M_{nm} := (it^1_n\, it^0_m)(\lambda x^0 x)$, where 
$it^i_n := \lambda f^{i+1} \lambda x^i (f \ldots (f x) \ldots)$ with n 
occurrences of f. Here the type 0 is any ground type and $i+1 := i \to i$. So 
$it^i_n$ is of type $i+2$ and $M_{nm}$ is of type 1 with $\lambda x^0 x$ as its 
normal form. The point in these examples is that the result is small (always 
$\lambda x^0 x$), but due to iterated function applications it takes many steps 
to reach the normal form. The table shows that normalization by evaluation 
with an internal evaluation is about twenty times faster than the version with 
self-made evaluation, and this again is faster than a recursively defined 
normalization. The run-times are given in seconds, or in minutes and seconds: 





          normalization     normalization     recursive
          by an internal    by a defined      normalization
          evaluation        evaluation

M_{45}    0                 1                 3
M_{55}    0                 2                 14
M_{56}    0                 6                 stack overflow
M_{66}    2                 36
M_{67}    4                 1:27
M_{76}    10                3:34
M_{77}    31                10:13
M_{78}    1:16              25:28
M_{88}    10:12             207:59





Now we give the main definitions of an implementation in Scheme. We restrict 
ourselves to the case of closed terms since by λ-abstraction we can bind all 
free variables. First our ↓- and ↑-functions: 



; reify (the ↓-function): maps a semantic object of the given type to a
; term family, i.e. a function from a variable index k to a term.
(define (reify type)
  (if (ground-type? type)
      (lambda (x) x)
      (let ((reflect-rho (reflect (arg-type type)))
            (reify-sigma (reify (val-type type))))
        (lambda (a)
          (lambda (k)
            (let ((xk (mvar k)))
              (abst xk ((reify-sigma
                         (a (reflect-rho (lambda (l) xk))))
                        (+ k 1)))))))))

; reflect (the ↑-function): maps a term family of the given type to a
; semantic object.
(define (reflect type)
  (if (ground-type? type)
      (lambda (x) x)
      (let ((reify-rho (reify (arg-type type)))
            (reflect-sigma (reflect (val-type type))))
        (lambda (r)
          (lambda (a)
            (reflect-sigma
             (lambda (k)
               (app (r k) ((reify-rho a) k)))))))))

Normalization with internal evaluation: 

(define (ev1 x) (eval x (the-environment)))

(define (norm1 r rho) (((reify rho) (ev1 r)) 0))

Normalization with defined evaluation: 

; ev2 interprets a term M in an environment env, an association list
; of (variable value) pairs.
(define (ev2 M)
  (lambda (env)
    (cond ((variable? M) (cadr (assq M env)))
          ((application? M)
           (((ev2 (operator M)) env) ((ev2 (argument M)) env)))
          ((abstraction? M)
           (lambda (a) ((ev2 (kernel M))
                        (cons (list (abstvar M) a) env))))
          (else #f))))

(define (norm2 r rho) (((reify rho) ((ev2 r) '())) 0))



Finally a recursive normalization: 

; norm3 normalizes the operator and the argument first and contracts
; the resulting redex, if any, by substitution.
(define (norm3 r)
  (cond ((variable? r) r)
        ((application? r)
         (let ((op (norm3 (operator r)))
               (arg (norm3 (argument r))))
           (if (abstraction? op)
               (let ((x (abstvar op))
                     (s (kernel op)))
                 (norm3 (substitute s x arg)))
               (app op arg))))
        ((abstraction? r)
         (abst (abstvar r) (norm3 (kernel r))))))



The auxiliary definitions are not shown and are hopefully obvious. For 
instance, mvar produces the variable $x_k$ from k, abst constructs a 
λ-abstraction $\lambda x_k M$ from $x_k$ and M, app constructs an application 
M N from M and N, and abstvar and kernel extract $x_k$ and M, respectively, 
from $\lambda x_k M$. 



Algebraic Models of Superscalar Microprocessor Implementations: A Case Study 

Abstract. We extend a set of algebraic tools for representing microprocessors 
to model superscalar microprocessor implementations, and apply 
them to a case study. We develop existing correctness models to accom- 
modate the more advanced timing relationships of superscalar processors, 
and consider formal verification. We illustrate our tools and techniques 
with an in-depth treatment of an example superscalar implementation. 
We use clocks to divide time into (not necessarily equal) segments, de- 
fined by the natural timing of the computational process of a device. We 
formally relate clocks by surjective, monotonic maps called retimings. In 
the case of superscalar microprocessors, the normal relationship between 
‘architectural time’ and ‘implementation time’ is complicated by the fact 
that events that are distinct in time at the architectural level can occur 
simultaneously at the implementation level. 



1 Introduction 

In this chapter, we extend a set of algebraic tools for microprocessors 
[HT96, HT97, FH96] to model superscalar microprocessor implementa- 
tions, and apply them to a case study. In superscalar microprocessors, the 
timing of events in an implementation can be substantially different from 
that of the architecture that they implement. We develop the existing 
correctness models of [HT96, HT97] to accommodate the more advanced 
timing relationships of superscalar processors, and consider formal verifi- 
cation. We illustrate our tools and techniques with an in-depth treatment 
of an example superscalar implementation, first seen in a simpler form in 
[FH96]. 

We are particularly interested in models of time and temporal abstrac- 
tion. Clocks divide time into (not necessarily equal) segments, defined by 
the natural timing of the computational process of a device: for example, 
the execution of machine instructions, or some system clock. We formally 
relate clocks by surjective, monotonic maps called retimings. In the case 
of superscalar microprocessors, the normal relationship between 
‘architectural time’ and ‘implementation time’ is complicated by the fact that 
events that are distinct in time at the architectural level can occur 
simultaneously at the implementation level. 

Interesting recent work on pipelined microprocessors includes [WC94] 
on UINTA, a processor of moderate complexity, and its verification in 
HOL [GM93]; [MS95a, MS95b] on AAMP5, a more complex processor, 
and its verification in PVS [ORSS94]; and [BD94] on a fragment of the 
DLX architecture [HP96]. More recently, superscalar processors have been 
addressed: in particular, the increased complexity of verification in the 
face of complex timing behaviour [WB96, Bur96, SDB96, Cyr96]. 

The intuitive models used in both UINTA and AAMP5 are conceptu- 
ally similar to our own [HT96, HT97, FH96]. However, there are substan- 
tial differences, particularly in the approach to time, and timing abstrac- 
tion. The main focus of attention of related work is on the engineering 
realities of developing techniques to successfully address more complex, 
and impressive, examples (almost always in conjunction with specific software 
tools). Our own work is concerned with developing a general formal 
framework for representing and verifying microprocessors within a uni- 
form and well-developed algebraic theory. 

In [WC94] systems are modelled as state streams: functions from time 
to state. Temporal and data abstraction functions are used to map be- 
tween different levels of abstraction. In earlier work [Win93], data and 
timing abstraction functions are separated (as in this paper). However, in 
[WC94], and related work on pipelined systems, they are combined. This 
is because the view is taken that the values of specification state compo- 
nents are distributed in time at the level of abstraction of the implemen- 
tation. For example, the value of a data register reg in an implementation 
may correspond with a specification state at time t, and the value of the 
program counter pc with a specification state t+n, where n pipeline stages 
are required for an instruction to progress from initiation to completion. 
We take the view that, rather than being temporally shifted, such state 
components are fundamentally different at the levels of specification and 
implementation. Consequently, we maintain a separation between data 
and temporal abstraction functions. 

The techniques of [MS95a, MS95b] derive from the earlier work of 
[BS90, SB91], in which specification and implementation are modelled 
as state sequences, but time is not explicitly present. To synchronise the 
specification and implementation state sequences, multiple copies of spec- 
ification states are inserted. In [MS95a, MS95b] a visible state predicate is 
introduced which identifies those implementation states that should 
correspond to a specification state. This approach is modified, in a manner 
similar to that of [WC94], to cope with pipelining by distributing data in 
time. Again, time is not explicitly present. A recent account of this and 
related work is [CRS97]. 

In [BD94] a simple three-stage ALU pipeline and a fragment of DLX 
are considered. Given a state Q of a pipelined implementation, a new 
state Q' is generated after executing one step of an instruction I. Both 
Q and Q' are then flushed by repeatedly stalling further execution 
(effectively, filling with no-ops). This results in two new states: $Q_f$, 
representing the (flushed) pipeline, and $Q'_f$, representing the (flushed) 
pipeline after executing instruction I. $Q_f$ and $Q'_f$ can be compared with 
appropriate specification states by projecting out the specification state 
elements. Note that there is no timing abstraction in this model: 
specification and implementation are both considered to take a single cycle 
to execute an instruction. This method is applicable if some mechanism for 
stalling the pipeline is available, which is generally the case. 

Also of interest is [Mel93], which again has a somewhat similar model of 
time. An injective, monotonic function $f_P$ maps abstract time to concrete 
time, and is defined in terms of a predicate P. If $P(t_c)$ holds for some 
concrete time $t_c$, then there is an abstract time $t_a$ such that 
$f_P(t_a) = t_c$. Predicate P is required to be true at an infinite number of 
times. The map $f_P$ is similar to the immersion of Sect. 3.1. 

Other, earlier, work on microprocessors includes the following. Gordon’s 
Computer [Gor83], since considered, in various forms, by others 
[Joy87, Sta93, HT97]. Viper [Coh87, Cul87], which has also been considered 
in [ALL+93]. Landin’s SECD machine [Lan63], considered in [Gra92, 
BG90]. The FM8501, a processor based on the PDP-11, and its more advanced 
successor FM9001 are discussed in [Hun89, Hun92, Hun94, BJ93]. 

The structure of this paper is as follows. In Sect. 2 we introduce the 
basic iterated map model of a microprocessor. In Sect. 3 we consider how 
we may express the correctness of one model of a (non-superscalar) micro- 
processor with respect to another, at a different level of abstraction, when 
both are represented as iterated maps. In Sect. 4 we informally introduce 
the fundamental aspects of superscalar microprocessors. In Sect. 5 we 
consider how our correctness model from Sect. 3 must be modified for 
superscalar microprocessors. In Sect. 6 we introduce a simple machine 
architecture. In Sect. 7 we informally introduce ACS, a superscalar im- 
plementation of our simple architecture. In Sect. 8 we formalise ACS in 
detail. Although we include the majority of the formal representation of 
ACS, space considerations force us to omit certain parts. A full treatment 
can be found in [Fox98]. Finally, in Sect. 9, we consider the correctness 
of ACS, and the problems of the formal verification of superscalar 
processors. 

2 Basic Models of Microprocessors 

In general, we model a microprocessor using an iterated map State, of the 
form: 

\[
\begin{aligned}
State &: T \times STATE \to STATE,\\
State(0, state) &= init(state),\\
State(t + 1, state) &= next(State(t, state)).
\end{aligned}
\]

1. T is a copy of the natural numbers $\mathbb{N}$, representing discrete 
time intervals, or clock cycles. 

2. STATE is the state-set of the microprocessor. Generally, this will be a 
Cartesian product of components representing registers, memory, etc. 

3. $init : STATE \to STATE$ is an initialisation function, that enforces 
internal consistency of the initial state of the microprocessor (e.g. 
given memory m, program counter pc and instruction register ir, we 
expect: $ir = m(pc)$) and acts as an invariant in formal verification: 
see Sect. 9. In the case of an architectural-level model, init will often 
be the identity function. Furthermore, when considering init as an 
invariant, it should be as weak as possible (Sect. 9). 

4. $next : STATE \to STATE$ is the next-state function, determining state 
evolution. 

For simplicity, we have chosen to omit inputs and outputs as they 
are not needed in our case study. In practice, their inclusion causes no 
difficulty [HT97, Fox98]. 
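
The iterated map can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy machine (a program counter paired with an accumulator) and the names `make_state_map`, `toy_init` and `toy_next` are hypothetical and do not come from the paper.

```python
from typing import Callable, TypeVar

S = TypeVar("S")


def make_state_map(init: Callable[[S], S], nxt: Callable[[S], S]) -> Callable[[int, S], S]:
    """Build State : T x STATE -> STATE from init and next (T is the naturals)."""

    def state_map(t: int, state: S) -> S:
        if t < 0:
            raise ValueError("clock cycles are natural numbers")
        current = init(state)          # State(0, state) = init(state)
        for _ in range(t):             # State(t + 1, state) = next(State(t, state))
            current = nxt(current)
        return current

    return state_map


# Hypothetical toy machine: state = (pc, acc); each cycle adds pc to acc.
def toy_init(state):
    pc, acc = state
    return (pc, acc)                   # identity, as is common at the architectural level


def toy_next(state):
    pc, acc = state
    return (pc + 1, acc + pc)


State = make_state_map(toy_init, toy_next)
```

Choosing a different `toy_next` (one instruction per cycle, or one system clock tick per cycle) changes the level of timing abstraction without changing `make_state_map`.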

The choice of T, STATE, init and next controls the level of abstraction 
of the formal representation. For example, the choice of clock T 
controls the level of timing abstraction. We can choose clock cycles of T 
to represent system clock cycles, which would be appropriate in the case 
of a low-level representation of an implementation; or we could choose 
cycles of T to represent instruction execution, with each clock cycle lasting 
precisely one instruction. This latter choice (an instruction clock) would 
be more appropriate for a high-level, architectural description. Notice in 
the latter case that clock cycles would typically vary in length, since in 
general different instructions will have different execution times. 
Additionally, the choice of STATE controls the level of data abstraction. If 
we wish to represent a microprocessor at the architectural level, we will 
choose STATE to represent those components visible to the programmer. 
If we wish to represent an implementation, STATE will additionally 
include components not visible to the programmer (e.g. buffer registers, 
cache memories, etc.). 

In addition to timing and data abstraction, we can consider structural 
abstraction, where a formal representation is sub-divided into component 
parts, representing the physical structure of the implementation. We 
may partition, or decompose, state-set STATE, iterated map State and 
next-state function next to reflect both the physical partitioning of the 
microprocessor, and the conceptual sub-tasks that must be performed in 
instruction execution. 

We may consider many different levels of abstraction when modeling 
microprocessors. However, we will restrict our attention to two: the 
programmer’s model PM, corresponding to the user-visible architectural 
level, and the abstract circuit model AC, corresponding with a high-level 
view of the implementation, commonly called the organisation. 

3 Simple Correctness Models 

Given two descriptions of microprocessors 

\[
\begin{aligned}
State_{PM} &: T \times STATE_{PM} \to STATE_{PM},\\
State_{AC} &: S \times STATE_{AC} \to STATE_{AC},
\end{aligned}
\]

how do we formulate the statement: 

    $State_{AC}$ correctly implements $State_{PM}$? 

3.1 Retimings 

First, we must consider how we can relate times on two different clocks. 
Given two clocks T and S, a function $\lambda : S \to T$ is called a retiming 
if it is: (i) monotonic, ensuring time always runs forwards on T and S; 
and (ii) surjective, ensuring that each time $t \in T$ corresponds with some 
time $s \in S$. We denote the set of all such retimings by $Ret(S,T)$. In 
the case of microprocessors, we can construct state-dependent retimings 
$\lambda : STATE \to Ret(S,T)$ that are functions of the state-set of a 
microprocessor representation. For example, in the case that T represents an 
instruction clock, and S a system clock, then $\lambda$ would map times on S 
to the time on T corresponding to the execution of the current machine 
instruction. A simple retiming is illustrated in Fig. 1. 







We can build a number of formal tools based on retimings. In this 
paper, we require the following. 

1. The immersion $\bar{\lambda} : T \to S$ of a retiming $\lambda$, defined by 

       $\bar{\lambda}(t) = (\text{least } s)[\lambda(s) = t]$. 

2. The start function $start : Ret(S,T) \to [S \to S]$, defined by 

       $start(\lambda)(s) = \bar{\lambda}\lambda(s)$. 

Further discussion of retimings can be found in [Har89, HT90, HT93, 
HT96]. 
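
A small Python sketch may help to fix these definitions. It represents a finite prefix of a retiming as a list whose index is a time on S and whose entry is the corresponding time on T; this list encoding, the sample values and the function names are assumptions made for illustration only.

```python
from typing import Sequence


def is_retiming(lam: Sequence[int]) -> bool:
    """Check monotonicity and surjectivity onto {0, ..., max} on a finite prefix."""
    if not lam:
        return False
    monotonic = all(a <= b for a, b in zip(lam, lam[1:]))
    surjective = set(lam) == set(range(max(lam) + 1))
    return monotonic and surjective


def immersion(lam: Sequence[int], t: int) -> int:
    """Return the least s with lam(s) = t."""
    return next(s for s, value in enumerate(lam) if value == t)


def start(lam: Sequence[int], s: int) -> int:
    """start(lam)(s): the first cycle of S mapped to the same cycle of T as s."""
    return immersion(lam, lam[s])


# Hypothetical retiming: instruction 0 takes three system cycles,
# instruction 1 takes two, and instruction 2 takes four.
lam = [0, 0, 0, 1, 1, 2, 2, 2, 2]
```

With this encoding, the times s satisfying `start(lam, s) == s` are exactly the system clock cycles at which a new cycle of T begins.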







Fig. 1. A simple retiming. 



3.2 Correctness Statement 



We construct a commutative diagram representing the correctness of 
$State_{AC}$ with respect to $State_{PM}$ as follows: 

\[
\begin{array}{ccc}
T \times STATE_{PM} & \xrightarrow{\;State_{PM}\;} & STATE_{PM} \\[4pt]
\Big\uparrow {\scriptstyle (\lambda,\,\psi)} & & \Big\uparrow {\scriptstyle \psi} \\[4pt]
S \times STATE_{AC} & \xrightarrow{\;State_{AC}\;} & STATE_{AC},
\end{array}
\]

where: 

1. $\lambda : STATE_{AC} \to Ret(S,T)$ is a state-dependent retiming, mapping 
the system clock and state of $State_{AC}$ to the instruction clock of 
$State_{PM}$. 

2. $\psi : STATE_{AC} \to STATE_{PM}$ is a projection function, discarding the 
elements of $State_{AC}$ not visible in $State_{PM}$. 

We say that $State_{AC}$ is correct with respect to $State_{PM}$ if the above 
diagram commutes, for all times $s \in S$ such that: 

    $s = start(\lambda(state))(s)$; 

that is, for all times corresponding with the end of a machine instruction, 
and for all states $state \in STATE_{AC}$. 
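
To make the commuting condition concrete, the following Python sketch checks it on a toy pair of models. Everything here is a hypothetical example rather than the paper's case study: the PM level adds the program counter to an accumulator once per instruction, the AC level takes two system clock cycles per instruction, and the retiming is the state-independent map s ↦ s div 2.

```python
def iterate(init, nxt, t, state):
    """State(t, state) for the iterated map built from init and next."""
    current = init(state)
    for _ in range(t):
        current = nxt(current)
    return current


# Hypothetical PM level: state = (pc, acc), one instruction per cycle of T.
def next_pm(state):
    pc, acc = state
    return (pc + 1, acc + pc)


# Hypothetical AC level: state = (pc, acc, tmp, phase), two cycles of S per instruction.
def init_ac(state):
    pc, acc, _tmp, _phase = state
    return (pc, acc, 0, 0)                 # start at an instruction boundary


def next_ac(state):
    pc, acc, tmp, phase = state
    if phase == 0:
        return (pc, acc, acc + pc, 1)      # compute the result into a buffer register
    return (pc + 1, tmp, tmp, 0)           # commit it to the PM-visible components


def psi(state):
    pc, acc, _tmp, _phase = state
    return (pc, acc)                       # discard components not visible in PM


def lam(state, s):
    return s // 2                          # retiming: system cycle -> instruction cycle


def start(state, s):
    return 2 * lam(state, s)               # least s' with lam(state, s') = lam(state, s)


def commutes(state, horizon):
    """Check the diagram at every s < horizon with s = start(lam(state))(s)."""
    for s in range(horizon):
        if s != start(state, s):
            continue
        lhs = psi(iterate(init_ac, next_ac, s, state))
        rhs = iterate(lambda x: x, next_pm, lam(state, s), psi(state))
        if lhs != rhs:
            return False
    return True
```

A bounded check of this kind is only a test, not a proof; the formal statement quantifies over all times and all states.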

4 Superscalar Microprocessors 

A pipelined processor allows new instructions to commence before their 
predecessors have finished execution. For example, instruction i may be 
fetched while instruction i − 1 is being decoded, i − 2 is being executed, 
and the result of i − 3 is being written back to registers or memory. Clearly, 
it is necessary to ensure that the relationships, or dependencies, between 
instructions permit this (see, for example, [Sto93, Joh91]). 

Pipelined processors permit several instructions to be in different 
stages of execution simultaneously. Superscalar processors extend this by 
allowing more than one instruction to be processed at each stage. Several 
instructions may start execution simultaneously, and finish together, or 
even out of program order. To achieve this the processor must contain 
multiple pipeline units at each stage. Machines capable of parallel 
instruction execution have existed since the 1960s: for example, the CDC 
6600 [Tho61] and the IBM 360/91 [AST67]. These machines were not 
superscalar, though they did contain parallel functional units: necessary 
because of the advanced pipelining techniques they used. The IBM 360/91 
used Tomasulo’s algorithm for the issuing logic [Tom67], commonly used 
in modern superscalar microprocessors for scheduling instruction 
execution. Superscalar processors first appeared in the late 1980s: for example 
the IBM RISC System/6000 [Gro90]. Subsequent superscalar processors 
include, among others: Sun Microsystems UltraSPARC-I; IBM/Motorola 
PowerPC 620; Intel Pentium and Pentium Pro; MIPS/NEC R10000; and 
DEC Alpha 21064 and Alpha 21164. 

The level of parallelism achievable by superscalar (and pipelined) 
processors is determined by the dependencies between instructions [Sto93, 
Joh91]. There are five types of dependency: the first three apply to both 
superscalar and pipelined processors. 

1. Data dependency occurs when an instruction needs the result of a 
preceding instruction. 

2. Procedural dependency occurs when a branch instruction disrupts 
the normal (sequential) flow of execution; requiring any work done on 
subsequent instructions to be discarded. This can be a particularly 
significant source of delay in the case of conditional branches, where 
the outcome may not be known until late in the pipeline. 

3. Resource conflicts occur when two instructions simultaneously require 
the same hardware resources (functional units, etc.). This problem 
can often be reduced by duplicating hardware, at a cost. For example, 
providing two addition units removes the resource conflict between a 
pair of add instructions. 

The remaining two dependencies apply only to superscalar processors. 

1. Antidependency occurs when an instruction overwrites the arguments 
of a preceding instruction. This is significant if instructions are allowed 
to execute out of order. 

2. Output dependency occurs when two instructions wish to store results 
at the same destination, which, again, is significant if instructions are 
allowed to execute out of order. 
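
Three of these dependencies can be detected purely from the locations an instruction reads and writes. The Python sketch below illustrates this; the `Instr` record, the register names and the sample instructions are hypothetical, and procedural dependencies and resource conflicts are deliberately left out because they depend on control flow and on the available hardware.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction summary: which locations it reads and writes."""
    name: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def dependencies(first: Instr, second: Instr) -> Set[str]:
    """Classify the dependencies of `second` on the earlier instruction `first`."""
    found = set()
    if first.writes & second.reads:
        found.add("data")      # second needs a result produced by first
    if first.reads & second.writes:
        found.add("anti")      # second overwrites an argument of first
    if first.writes & second.writes:
        found.add("output")    # both store to the same destination
    return found


add1 = Instr("add r1, r2, r3", frozenset({"r1", "r2"}), frozenset({"r3"}))
add2 = Instr("add r3, r4, r5", frozenset({"r3", "r4"}), frozenset({"r5"}))
set1 = Instr("set r1, 7", frozenset(), frozenset({"r1"}))
set3 = Instr("set r3, 0", frozenset(), frozenset({"r3"}))
```

An in-order pipeline only has to respect the "data" case; the "anti" and "output" cases start to matter once `second` may complete before `first`.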

These dependencies must be resolved if the processor is to function 
correctly: that is, to be functionally indistinguishable from a 
non-superscalar sequential architecture model. In a sequential architecture, 
each instruction is assumed to finish before its successor. This is a nat- 
ural model for instruction-level computation at the PM level of abstrac- 
tion. Furthermore, it is also the model used by older implementations of 
currently-popular architectures. In order to preserve the natural model, 
and maintain backward compatibility, it is necessary for superscalar pro- 
cessors to retain, or be able to reconstruct, a state which corresponds 
exactly with a state of the PM level. A precise architectural state is a 
processor state which meets this condition. A processor can generate pre- 
cise states if dependencies are correctly resolved and results are written 
to PM-level state components in program order. This is particularly im- 
portant in the case of exceptions. For example, if instruction i causes an 
exception because of an error, it is reasonable to expect that instruction 
i + 1 has not yet executed. 

There are a number of techniques used to maximise throughput, whilst 
observing dependencies and maintaining a precise architectural state. The 
most common method is some form of register renaming, where additional 
registers are used to resolve instruction dependencies, and to temporarily 
store results. A common form of register renaming is a reorder buffer. 
This consists of a circular buffer of registers, used to (temporarily) store 
the results of computations before they are committed, or retired, to 
PM-level state components. Whenever an instruction is dispatched for 
execution, the next available slot in the buffer is reserved to store the 
result. Results are inserted into the buffer, in the appropriate location, 
when they become available. They are then removed from the head of 
the buffer, and are stored in architectural components, in program order. 
For example, suppose instructions i, i + 1 and i + 3 generate results at 
time t, but that i + 2 has yet to finish. The results of i and i + 1 can be 
transferred to PM-level components, but i + 3 must wait until i + 2 has 
also finished. This ensures a precise architectural state: if the machine is 
interrupted at this point, execution will restart with instruction i + 2, and 
i + 3 will be re-executed. In order to speed up execution, implementations 
generally permit bypassing. That is, results in the reorder buffer can be 
used by subsequent instructions (subject to dependencies) before they are 
moved to the relevant PM-level components. 
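
The reorder buffer discipline can be sketched as a small Python class. This is an illustrative model under stated assumptions, not the ACS design: the class name, the use of tags to identify instructions, and the convention that `None` marks an unfinished result are all hypothetical, and bypassing is not modelled.

```python
from collections import deque


class ReorderBuffer:
    """Hypothetical reorder buffer: slots are reserved at dispatch, filled out of
    order, and retired strictly from the head (program order)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = deque()               # entries are [tag, result-or-None]

    def dispatch(self, tag):
        """Reserve the next slot; return False if the buffer is full (stall)."""
        if len(self.slots) >= self.capacity:
            return False
        self.slots.append([tag, None])
        return True

    def complete(self, tag, result):
        """Insert a result into the slot reserved for the instruction `tag`."""
        for slot in self.slots:
            if slot[0] == tag:
                slot[1] = result
                return
        raise KeyError(tag)

    def retire(self):
        """Remove and return all finished results at the head, in program order."""
        retired = []
        while self.slots and self.slots[0][1] is not None:
            tag, result = self.slots.popleft()
            retired.append((tag, result))
        return retired
```

Replaying the example from the text: after dispatching i, i + 1, i + 2 and i + 3 and completing all but i + 2, a call to `retire` yields only the results of i and i + 1; the result of i + 3 stays in the buffer until i + 2 completes.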

5 Superscalar Correctness 

The correctness model in Sect. 3 is applicable to simple, microprogrammed 
implementations, and to more complex pipelined implementations, where 
each system clock cycle corresponds with the termination of at most one 
machine instruction. In a superscalar implementation, it is possible for 
multiple instructions to terminate on a single clock cycle. That is, there 
may be cycles of clock T that correspond with no cycle of clock S. Hence 
there is no retiming from S to T. To solve this problem, we introduce 
a new retirement clock R. Cycles of clock R mark the committal of one 
or more machine instructions. We can construct two retimings: 
$\lambda_1 : T \to R$, mapping instruction clock cycles to retirement clock 
cycles, and $\lambda_2 : S \to R$, mapping system clock cycles to retirement 
clock cycles. We can construct the map $\rho : S \to T$ from system clock 
cycles to instruction clock cycles by composition: 

    $\rho(t) = \bar{\lambda}_1\lambda_2(t)$, 

where $\bar{\lambda}_1 : R \to T$ is the immersion of $\lambda_1$ (Sect. 3.1). 
Function $\rho$ is illustrated in Fig. 2. Note that $\rho$ is not a retiming 
since it need not be surjective. If, for example, instructions i and i + 1 
commit simultaneously, it is not meaningful to talk about the correctness of 
$State_{AC}$ after instruction i, since there is no time at which instruction i 
has terminated, and instruction i + 1 has not. 








Fig. 2. Retimings from T to R and from S to R. 
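
The failure of surjectivity is easy to see on a small example. In the Python sketch below, finite prefixes of the two retimings are encoded as lists (index = argument, entry = value); the particular values are hypothetical and chosen so that two instruction cycles share one retirement cycle.

```python
def immersion(lam, r):
    """Least argument mapped to r by the retiming lam (given as a list)."""
    return lam.index(r)


# Hypothetical run: instruction cycles 1 and 2 share retirement cycle 1.
lam1 = [0, 1, 1, 2]            # T -> R: instruction cycle -> retirement cycle
lam2 = [0, 0, 1, 1, 1, 2]      # S -> R: system cycle -> retirement cycle


def rho(s):
    """rho = (immersion of lam1) composed with lam2, a map S -> T."""
    return immersion(lam1, lam2[s])


image = sorted({rho(s) for s in range(len(lam2))})
missing = [t for t in range(len(lam1)) if t not in image]
```

Here `rho` is monotonic, but cycle 2 of T is never reached: no system clock time separates the two instructions that retire together, so correctness can only be stated at the cycles of T that do lie in the image.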



6 PM: A Simple Architecture 

To illustrate our algebraic tools, we will introduce a simple machine archi- 
tecture PM, consisting of: separate data and instruction memories; a set 
of registers; and five instructions, with the following informal meanings. 

1. add $reg_a$, $reg_b$, $reg_c$: add register $reg_a$ to register $reg_b$, 
store the result in register $reg_c$ and increment the program-counter. 

2. branch addr: if register $reg_0$ is zero then add the program-counter 
to addr and store the result in the program-counter; otherwise 
increment the program-counter. 

3. load $reg_a$, addr: load the contents of the data memory, at location 
addr, into register $reg_a$, and increment the program-counter. 

4. store $reg_a$, addr: store the contents of register $reg_a$ in the data 
memory at location addr, and increment the program-counter. 

5. set $reg_a$, val: store the constant val in register $reg_a$, and 
increment the program-counter. 

We will first describe the state algebra of PM, defining the state-space, 
clock and state function. We then describe the next state algebra, defining 
the initialisation and the next-state function. 

The PM state algebra is as follows: 

Algebra PM State 
Carrier Sets 
    $T$, $STATE_{PM}$ 
Constants 
Operations 
    $State_{PM} : T \times STATE_{PM} \to STATE_{PM}$ 
End Algebra. 

The PM state function $State_{PM}$ is defined below 

\[
\begin{aligned}
State_{PM}(0, state) &= state,\\
State_{PM}(t + 1, state) &= next_{PM}(State_{PM}(t, state)),
\end{aligned}
\]

where state is a tuple of type $STATE_{PM}$. We assume all states are valid 
initial states for time zero: hence there is no need for an initialisation 
function. Each cycle of clock T corresponds with one machine instruction. 
The state-space of the architecture is 

\[
STATE_{PM} = Mem \times Mem \times PC \times Reg.
\]

The various components, and subcomponents, of $STATE_{PM}$ are defined 
as follows. 

\[
\begin{array}{ll}
RC = W_a & (\text{where } a \in \mathbb{N}^+ \text{ is the number of PM registers}),\\
PC = W_b & (\text{where } 2^b \in \mathbb{N}^+ \text{ is the number of PM memory addresses}),\\
R = W_c & (\text{where } c \in \mathbb{N}^+ \text{ is the PM register width}),\\
OP = W_3 & (\text{where } OP \text{ is the operation code field of an instruction}),\\
Reg = [RC \to R], & \\
Mem = [PC \to R], &
\end{array}
\]

and $W_n = Bit^n$ ($n \in \mathbb{N}^+$) is the set of n-bit words 
($Bit = \{0, 1\}$). 

The typical state element will be of the form $(mp, md, pc, reg) \in 
STATE_{PM}$ where: $mp \in Mem$ is the program memory; $md \in Mem$ is 
the data memory; $pc \in PC$ is the program-counter; and $reg \in Reg$ is the 
set of registers. 

The PM next-state algebra is as follows: 

Algebra PM Next-State 
Carrier Sets 
    $T$, $STATE_{PM}$ 
Constants 
    $0 \in T$ 
Operations 
    $\cdot + 1 : T \to T$ 
    $next_{PM} : STATE_{PM} \to STATE_{PM}$ 
End Algebra. 

The function $\cdot + 1$ is the successor clock cycle function which 
enumerates time from the initial cycle $0 \in T$. The next-state function 
$next_{PM}$ is defined as follows 

\[
next_{PM}(mp, md, pc, reg) =
\begin{cases}
(mp, md, pc + 1, reg[reg_{ra(ins)} + reg_{rb(ins)} / rc(ins)]), & \text{if } op(ins) = \text{add};\\
(mp, md, pc + addr(ins), reg), & \text{if } op(ins) = \text{branch and } reg_0 = 0;\\
(mp, md, pc + 1, reg), & \text{if } op(ins) = \text{branch and } reg_0 \neq 0;\\
(mp, md, pc + 1, reg[md(addr(ins)) / ra(ins)]), & \text{if } op(ins) = \text{load};\\
(mp, md[reg_{ra(ins)} / addr(ins)], pc + 1, reg), & \text{if } op(ins) = \text{store};\\
(mp, md, pc + 1, reg[pad(val(ins)) / ra(ins)]), & \text{if } op(ins) = \text{set},
\end{cases}
\]

where $ins = mp(pc)$ is the next instruction to be executed. There are 
six cases; one for each type of instruction with the exception of branch, 
which has two cases (taken and not taken). The generic notation $r[v/a]$ 
is an abbreviation for functions of the form: 

\[
sub : [A \to V] \times V \times A \to [A \to V],
\]

\[
sub(r, v, a)(i) =
\begin{cases}
v, & \text{if } i = a,\\
r(i), & \text{otherwise.}
\end{cases}
\]



The functions $op : R \to OP$, $ra, rb, rc : R \to RC$, and $addr, val : 
R \to PC$ extract the op-code, register and address/value fields of an 
instruction, respectively. 
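
The next-state function translates almost line for line into Python. The sketch below makes several simplifying assumptions that are not part of the paper: instructions are tuples rather than bit words (so field extraction becomes tuple unpacking and the pad function disappears), memories and registers are dictionaries, and values are unbounded integers rather than fixed-width words.

```python
def sub(r, v, a):
    """r[v/a]: a copy of the map r that sends a to v and agrees with r elsewhere."""
    updated = dict(r)
    updated[a] = v
    return updated


def next_pm(state):
    """One instruction step of the PM architecture on (mp, md, pc, reg)."""
    mp, md, pc, reg = state
    ins = mp[pc]                             # next instruction to be executed
    op = ins[0]
    if op == "add":
        _, ra, rb, rc = ins
        return (mp, md, pc + 1, sub(reg, reg[ra] + reg[rb], rc))
    if op == "branch":
        _, addr = ins
        if reg[0] == 0:
            return (mp, md, pc + addr, reg)  # branch taken
        return (mp, md, pc + 1, reg)         # branch not taken
    if op == "load":
        _, ra, addr = ins
        return (mp, md, pc + 1, sub(reg, md[addr], ra))
    if op == "store":
        _, ra, addr = ins
        return (mp, sub(md, reg[ra], addr), pc + 1, reg)
    if op == "set":
        _, ra, val = ins
        return (mp, md, pc + 1, sub(reg, val, ra))
    raise ValueError(f"unknown op-code: {op!r}")
```

Because `sub` copies rather than mutates, every call to `next_pm` returns a fresh state, which matches the functional reading of the iterated map.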

The definitions of op, ra, rb, rc, addr and val are omitted, as is that of 
pad, which simply extends bit strings by padding with zeros. 

7 ACS: An Informal Description 

We introduce an implementation ACS of PM that will include component 
parts, typical of a superscalar microprocessor, in a somewhat simplified 
form. We will permit out-of-order instruction issue and execution, and 
will enforce in-order instruction retirement by means of a scoreboarding 
algorithm, implemented using a re-order buffer, thus preserving a precise 
architectural state. We will permit a maximum of four instructions to 
execute simultaneously, dependencies permitting. 

The implementation presented lacks some features, such as register re- 
naming, present in many superscalar machines (see for example [SS95]). 
Instructions can execute out-of-order and are committed in-order. Thorn- 
ton’s algorithm is used to resolve dependencies instead of the more com- 
plex Tomasulo’s algorithm [WS84]. In addition, we do not permit bypass- 
ing of results from the reorder buffer. However, our example does have the 
essential property of a superscalar implementation: the ability to commit 
more than one instruction in a single cycle. The intention is to present 
a pedagogical example, that is not obscured by unnecessary complexity. 
None of the omitted features would be difficult to introduce. 

First we describe the physical structure of the microprocessor and its 
component parts. Then we consider the conceptual operations performed 
by groups of physical components, operating together. We discuss the 
relationship between physical components and conceptual operations in 
Sect. 8. 

7.1 Processor Organisation 

The ACS processor consists of the eight physical units shown in Fig. 3. 

1. The Instruction Cache stores machine instructions, performing the 
same function as the PM program memory. For simplicity, we will 
assume that the instruction cache is the same size as the PM program 
memory. In reality, it will be much smaller, and an attempt to fetch 
an instruction not in the cache will cause a cache miss. This will 
cause a block of memory containing the missing instruction to be 
fetched into the instruction cache. Steps will need to be taken to decide 
which cache replacement strategy to use, and the operation of the 
processor may stall while the new instruction is fetched. Modeling this 
algebraically causes no difficulties. However, in ACS, it would complicate 
the representation in an unhelpful way, and is hence omitted. 

2. The Instruction Buffer contains a fetch program counter and a buffer. 
The buffer stores a small number of instructions, fetched from the in- 
struction cache using the fetch program counter. The buffer maintains 
a reserve of instructions available for processing in the event of a cache 
miss. We have eliminated cache misses in ACS, but the instruction 
buffer is a necessary part of a superscalar processor, and hence we 
have retained it. 

3. The Decode Unit breaks down instructions into component parts: op- 
eration codes, and operand specifiers. The ACS processor has a very 
simple, fixed instruction format, and hence the decode unit is very 
simple. Real processors, with more complex instruction sets, will gen- 
erally require a more sophisticated decode unit. 

4. The Issue Unit contains four reservation stations [Tom67], one for 
each functional unit. These schedule program execution, by deter- 
mining dependencies, and then forward instructions that are free to 
execute to the appropriate functional units. 

5. The Functional Units compute results based on instructions and their 
operands. There are two adders (allowing add instructions to execute 
in parallel), a load-store unit and a branch unit. 

6. The Reorder Buffer Unit stores instruction results prior to their com- 
mittal to PM-level components (data cache, registers and program 
counter). Results from instructions that finish execution out of pro- 
gram order are held in a buffer until their predecessors have finished 
execution, at which point they are written back {committed) to the 
appropriate PM-level components. 

7. The Register Unit contains PM registers, PM program counter, and 
a reset flag for clearing the pipeline. It also contains an array of bits 
indicating which registers are currently in use, and a list of memory 
locations currently in use. 

8. The Data Cache performs the same function as the PM data memory. 
As with the instruction cache, the data cache is assumed to be the 
same size as the PM model data memory, eliminating cache misses. 
Again, there is no difficulty in modeling a smaller cache algebraically; 
though we would have to deal with the additional question of the 
cache write strategy, since, unlike the instruction cache, it is possible 
to write to the data cache. 

We briefly describe each unit below. Each of these units is involved
in one or more processor operations.

Fig. 3. A simple organisation.



7.2 Processor Operations 

There are six operations involved in instruction execution, outlined below. 

1. Instruction Fetch. Instructions are read from the current fetch program
counter value, which attempts to predict the future evolution of the
PM program counter, assuming there are no branches taken. Clearly,
any taken branches will render subsequently-fetched instructions
invalid. To simplify ACS, branches are not treated specially: that is,
all branches are predicted as not taken. Correctly predicting branch
destinations significantly improves performance, because branching to the
‘wrong’ destination requires the pipeline to be emptied. In practice,
it may be more effective to predict branches to be taken; and
maintaining a history of previous behaviour would certainly be better.
Modifying ACS to accommodate either of these (or other) strategies
would present no difficulties.

2. Instruction Decode. Instructions are decoded into five fields: an op- 
code, three register indexes and an address/value field. Not all fields 
are required for all instructions. 

3. Instruction Dispatch and Issue. Instruction dependencies are resolved
at run-time before issue, so that the reorder buffer can always recover a
precise architectural state. There are a number of alternative methods
available. ACS uses Thornton’s algorithm for the issue logic [Tho70].
When resources are available, instructions are dispatched (added) to
the end of the appropriate reservation station’s buffer, and a place is
reserved in the reorder buffer to hold the result. Instructions can then
be issued (removed), in any order, from the reservation station, when
all dependencies are resolved.

4. Instruction Execution. The implementation contains two adders, a 
load-store unit and a branch unit, for executing: add and set; load 
and store; and branch instructions respectively. Two add instructions 
may be executed simultaneously, or one add and one set. 

5. Instruction Reorder. Instruction results (from the functional units) 
are inserted into their pre-reserved slots in the reorder buffer. 

6. Instruction Committal. Once results reach the top of the reorder buffer
they are removed and committed to PM state components (register,
program counter or data cache). This means that the PM state components
always contain a precise architectural state, because instruction
$i$ is not allowed to commit until all instructions $i'$ with $i' < i$ have
generated results in the reorder buffer, and have themselves commit- 
ted. 
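
The interplay of dispatch, reorder and committal (steps 3, 5 and 6) can be illustrated with a minimal Python sketch. It is not the ACS definition: the names `dispatch`, `reorder_insert` and `commit` are hypothetical, slots are located by an instruction tag rather than by a reorder buffer position, and every finished result is simply labelled `"reg"`.

```python
from collections import deque

WAIT = "wait"  # marks a slot that is reserved but not yet filled


def dispatch(rob, tag):
    """Reserve a slot at the tail of the reorder buffer for instruction `tag`."""
    rob.append({"tag": tag, "type": WAIT, "value": None})


def reorder_insert(rob, tag, value):
    """Write a finished result into its pre-reserved slot."""
    for entry in rob:
        if entry["tag"] == tag:
            entry["type"], entry["value"] = "reg", value
            return
    raise KeyError(tag)


def commit(rob):
    """Remove results from the head of the buffer while they are complete."""
    committed = []
    while rob and rob[0]["type"] != WAIT:
        entry = rob.popleft()
        committed.append((entry["tag"], entry["value"]))
    return committed


rob = deque()
for tag in (1, 2, 3):
    dispatch(rob, tag)

reorder_insert(rob, 3, 30)   # instruction 3 finishes out of order
first = commit(rob)          # nothing commits: instruction 1 is still waiting
reorder_insert(rob, 1, 10)
second = commit(rob)         # only instruction 1 commits
reorder_insert(rob, 2, 20)
third = commit(rob)          # instructions 2 and 3 commit, in program order
```

Because results only leave the buffer from the head, the committed sequence always follows program order, however the functional units interleave.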

8 ACS: A Formal Description 

We will formally represent ACS as follows. 

1. We describe the state algebra of ACS in Sect. 8.1. This algebra defines 
the state-space, AC clock and state function of ACS. 

2. We then describe the next state algebra in Sect. 8.2. This algebra 
defines the initialisation and next-state functions of ACS. There is 
one next-state function for each of the physical units in ACS. 

3. Finally, we describe the processor operation algebra in Sect. 8.3. Each 
of the next-state functions for the units described in Sect. 8.2 is defined 
in terms of processor operations, for example, instruction decode. Each 
operation will typically affect a number of processor units. 

The relationships between processor operations and units are shown in
Figure 4. Operations are represented by rectangular boxes and machine units
are represented by rounded boxes. For example, the operation Execute
affects the issue unit (from which instructions are removed) and the
functional units (where instruction results are computed). Execute receives
input from the issue unit, the register unit and the data cache.

Fig. 4. Dependencies of the microprocessor operations.



8.1 The State Algebra 

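The state algebra for the ACS processor consists of carrier sets for time $S$
and the state of the processor $\mathit{STATE}$, and the state function
$\mathit{State}_{ACS}$, which returns a new state of the processor, given a
time and an initial state.

Algebra ACS State
Carrier Sets
    $S$, $\mathit{STATE}$
Constants

Operations
    $\mathit{State}_{ACS} : S \times \mathit{STATE} \to \mathit{STATE}$
End Algebra.

The state function is defined below.

\[
\begin{aligned}
\mathit{State}_{ACS}(0, \mathit{state}) &= \mathit{init}_{ACS}(\mathit{state}), \\
\mathit{State}_{ACS}(s + 1, \mathit{state}) &= \mathit{next}_{ACS}(\mathit{State}_{ACS}(s, \mathit{state})).
\end{aligned}
\]

The hidden function $\mathit{next}_{ACS} : \mathit{STATE} \to \mathit{STATE}$ is defined below.

\[
\begin{aligned}
\mathit{next}_{ACS}(\mathit{state}) = (&\mathit{Icache},\ \mathit{Ibuffer}(\mathit{state}),\ \mathit{DecodeDisp.}(\mathit{state}),\ \mathit{Issue}(\mathit{state}),\ \mathit{Functional}(\mathit{state}), \\
&\mathit{Reorder}(\mathit{state}),\ \mathit{Registers}(\mathit{state}),\ \mathit{Dcache}(\mathit{state})).
\end{aligned}
\]

The initialisation function $\mathit{init}_{ACS}$, and next-state functions for the ACS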
units Ibuffer, DecodeDisp., Issue, Functional, Reorder, Registers and 
Dcache are defined in Sect. 8.2. 
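
The state function is an iterated map: initialise once, then apply the next-state function once per clock cycle. The following Python sketch shows only this shape. The helper `make_state_function` and the toy `(counter, reset)` state are hypothetical stand-ins and have nothing to do with the actual ACS state-space.

```python
def make_state_function(init, next_state):
    """Build State(t, s) from an initialisation and a next-state function."""
    def state(t, s):
        current = init(s)            # State(0, s) = init(s)
        for _ in range(t):           # State(t + 1, s) = next(State(t, s))
            current = next_state(current)
        return current
    return state


def toy_init(s):
    """Toy initialisation: keep the counter, raise the reset flag."""
    counter, _reset = s
    return (counter, 1)


def toy_next(s):
    """Toy next-state function: advance the counter, clear the reset flag."""
    counter, _reset = s
    return (counter + 1, 0)


state_fn = make_state_function(toy_init, toy_next)
```

For instance, `state_fn(3, (5, 0))` applies `toy_init` once and `toy_next` three times.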

8.2 Processor Initialisation and Next-State Functions 

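The next-state algebra for ACS consists of carrier sets for time $S$ and
the state of the processor $\mathit{STATE}$, together with carrier sets for
individual units (the Cartesian product of which constitutes the overall state
set $\mathit{STATE}$). The operations in the next-state algebra are the successor
function for time, the initialisation function for the processor $\mathit{init}_{ACS}$,
and the individual next-state functions for each of the units.

Algebra ACS Next-State
Carrier Sets
    $S$, $\mathit{STATE}$, $\mathit{Ibuffer}$, $\mathit{Decode}$, $\mathit{Issue}$, $\mathit{Functional}$, $\mathit{ReorderBuf}$,
    $\mathit{Registers}$, $\mathit{Dcache}$
Constants
    $0 \in S$
Operations
    $s + 1 : S \to S$
    $\mathit{init}_{ACS} : \mathit{STATE} \to \mathit{STATE}$
    $\mathit{Ibuffer} : \mathit{STATE} \to \mathit{Ibuffer}$
    $\mathit{DecodeDisp.} : \mathit{STATE} \to \mathit{Decode}$
    $\mathit{Issue} : \mathit{STATE} \to \mathit{Issue}$
    $\mathit{Functional} : \mathit{STATE} \to \mathit{Functional}$
    $\mathit{Reorder} : \mathit{STATE} \to \mathit{ReorderBuf}$
    $\mathit{Registers} : \mathit{STATE} \to \mathit{Registers}$
    $\mathit{Dcache} : \mathit{STATE} \to \mathit{Dcache}$
End Algebra.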

The components of this algebra are defined in the following sections. 

Processor State-Space and Clock. The machine state-space is made
up of the Cartesian product of the state-spaces for each of the eight main
ACS units.

\[
\mathit{STATE} = \mathit{Icache} \times \mathit{Ibuffer} \times \mathit{Decode} \times \mathit{Issue} \times \mathit{Functional} \times \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache}.
\]

A typical vector will be of the form

\[
\mathit{state} = (\mathit{Icache}, \mathit{ibuffer}, \mathit{decode}, \mathit{issue}, \mathit{functional}, \mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache}).
\]

Figure 5 gives a pictorial representation of the AC state of the processor. 

The state-space of ACS is hierarchical in that each physical unit has
its own state, which is constructed of simpler state sets, that may in
turn be made up of yet more primitive components. The full name of
a state component is of the form $x_1 \cdot x_2 \cdot x_3 \cdots x_n$. For example,
$\mathit{issue} \cdot \mathit{AddIssue}_1 \cdot \mathit{regopnd}_1 \cdot \mathit{ready}$ is the ready bit of the first operand of the
first adder reservation station in the issue unit. To simplify the definitions,
and where no ambiguity results (the vast majority of cases), we will omit
name elements.

The state-space of ACS makes heavy use of buffers and lists. We
assume the existence of a general-purpose finite buffer algebra, and a
general-purpose finite list algebra, which we will not define here: see
[Fox98]. In general, $\mathit{Buffer}(\mathit{bufsize}, \mathit{Data})$ and $\mathit{List}(\mathit{bufsize}, \mathit{Data})$ are the sets of
finite buffers and lists respectively, of size bufsize containing elements from
the set Data. We will informally introduce buffer and list operations as
required.

In Figure 5, we imply concrete buffer and list operations, based on
register files, with head and tail pointers (and usage bits in the case of lists).
Only the Dispatch operation (Sect. 8.3) actually requires such a concrete
definition. Given we are specifying hardware, this is not unreasonable.

The processor state-space is parameterised by eight constants, defining
the sizes of the buffers and reservation stations in the processor:

ibufsize = instruction buffer entries;
decsize = decode unit entries;
addsize1 = reservation station entries for the first adder;
addsize2 = reservation station entries for the second adder;
brsize = reservation station entries for the branch unit;
lsrsize = reservation station entries for the load-store unit;
reordersize = reorder buffer entries; and
memusize = memory address usage entries.

These constants would be instantiated for any given concrete implementation
of the processor and, in practice, are limited by technological constraints.

Fig. 5. The main components of the ACS processor.

The clock S synchronises the processor’s operations and is not a mea- 
sure of absolute time: that is, it is not a system clock. The clock rate of an 
implementation of ACS would be determined by the minimum speed of 
the processor units: the maximum time taken for the slowest unit to reach 
its next state. It is also assumed that all of the processor’s functional units 
will compute results in one clock cycle. In a real processor, each of the
individual processor units may have its own pipeline. Typically,
functional units for, say, floating point operations may take several cycles to
compute. We have chosen to ignore some technological realities in order
to simplify the example. For example, no limit is placed on the number 
of instructions that can be committed in any one cycle. In practice, bus 
width and cache/memory bandwidth would impose a limit of (currently) 
a few instructions per cycle. 

Processor Initialisation. A strong definition for the processor
initialisation function $\mathit{init}_{ACS}$ is:

\[
\mathit{init}_{ACS}(\mathit{state}) = (\mathit{Icache}, \mathit{ibuffer}, \mathit{decode}, \mathit{issue}, \mathit{functional}, \mathit{reorderbuf}, (\mathit{reg}, \mathit{regu}, \mathit{memu}, 1, \mathit{pc}), \mathit{Dcache}).
\]

The function above simply sets the reset flag (in the register unit) to
one, effectively emptying the pipeline. While simple, this definition of
initialisation causes problems when considering the correctness of ACS: see
Sect. 9. Ideally, we would like the weakest possible initialisation function;
that is, one which only modifies the initial state if it is internally
inconsistent. Unfortunately, given the complexity of ACS, such an initialisation
function is difficult to define: see Sect. 9 and [Fox98].

The Instruction Cache. The state of the instruction cache is
$\mathit{Icache} = [\mathit{PC} \to \mathit{Reg}]$. Since we do not permit modifications to the program, the
instruction cache does not change state. Hence there is no next-state
function.

The Instruction Buffer. The state of the instruction buffer is:

\[
\mathit{Ibuffer} = \mathit{Buffer}(\mathit{ibufsize}, \mathit{Reg}) \times \mathit{PC},
\]

with component names:

\[
\mathit{ibuffer} = (\mathit{instbuf}, \mathit{fetchpc}).
\]

The fetch buffer instbuf stores instruction words fetched from the instruction
cache. The register fetchpc holds the address of the next instruction
to be fetched, ‘guessing’ the future value of the PM program counter by
assuming the absence of branches.

The next-state function $\mathit{Ibuffer} : \mathit{STATE} \to \mathit{Ibuffer}$ is defined below.

\[
\mathit{Ibuffer}(\mathit{state}) =
\begin{cases}
\mathit{IBuffMrg}(\pi^{Fetch}_{ibuffer}(\mathit{Fetch}(\mathit{Icache}, \mathit{ibuffer})),\ \pi^{Decode}_{ibuffer}(\mathit{Decode}(\mathit{ibuffer}, \mathit{decode}))), & \text{if } \mathit{reset} = 0; \\
\pi^{Fetch}_{ibuffer}(\mathit{Fetch}(\mathit{Icache}, \langle \mathit{Reset}(\mathit{instbuf}), \mathit{pc} \rangle)), & \text{otherwise.}
\end{cases}
\]

If the pipeline is not being reset then instruction buffer entries at the
head of ibuffer are removed by the operation
$\mathit{Decode} : \mathit{Ibuffer} \times \mathit{Decode} \to \mathit{Ibuffer} \times \mathit{Decode}$ (Sect. 8.3), and new ones inserted at the tail of ibuffer
by the operation $\mathit{Fetch} : \mathit{Icache} \times \mathit{Ibuffer} \to \mathit{Icache} \times \mathit{Ibuffer}$ (Sect. 8.3). If
the pipeline is being reset (i.e. after a branch), then the instruction buffer
is cleared with the buffer operation Reset, which empties the buffer. It
is then filled with entries fetched from the instruction cache using the
current PM program counter value pc, rather than the fetch program
counter fetchpc. Note that $\langle \mathit{Reset}(\mathit{instbuf}), \mathit{pc} \rangle$ is a tuple of type
Ibuffer. We use $\langle \cdots \rangle$ to identify tuples, here and elsewhere, to avoid
confusion over the numbers and types of arguments of functions.

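The projection functions $\pi^{Decode}_{ibuffer} : \mathit{Ibuffer} \times \mathit{Decode} \to \mathit{Ibuffer}$ and
$\pi^{Fetch}_{ibuffer} : \mathit{Icache} \times \mathit{Ibuffer} \to \mathit{Ibuffer}$ are defined as follows:

\[
\begin{aligned}
\pi^{Decode}_{ibuffer}(\mathit{ibuffer}, \mathit{decode}) &= \mathit{ibuffer}, \\
\pi^{Fetch}_{ibuffer}(\mathit{Icache}, \mathit{ibuffer}) &= \mathit{ibuffer}.
\end{aligned}
\]

In general, projection functions of the form $\pi^{y}_{x}$ project out the state
element named $x$ from some tuple represented by $y$, where $y$ may either be a
state element, or a function returning a state element. We will omit the
definitions of such projections from now on.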

The hidden function $\mathit{IBuffMrg} : \mathit{Ibuffer} \times \mathit{Ibuffer} \to \mathit{Ibuffer}$ combines
the results of instruction fetch and instruction decode.

\[
\mathit{IBuffMrg}(\mathit{instbuf}_1, \mathit{fetchpc}_1, \mathit{instbuf}_2, \mathit{fetchpc}_2) = (\mathit{Merge}(\mathit{instbuf}_1, \mathit{instbuf}_2), \mathit{fetchpc}_1).
\]

The fetch program counter is affected by the Fetch operation but not by
the Decode operation; hence $\mathit{fetchpc}_2$ is discarded. Instruction buffers
$\mathit{instbuf}_1$ and $\mathit{instbuf}_2$ are combined using the buffer function Merge,
which concatenates the contents of two buffers.

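The Decode and Dispatch Unit. The state of the decode and dispatch
unit is:

\[
\mathit{Decode} = \mathit{Buffer}(\mathit{decsize}, \mathit{DecEntry}),
\]

where

\[
\mathit{DecEntry} = \mathit{OP} \times \mathit{RC}^3 \times \mathit{PC},
\]

with component names:

\[
\mathit{decentry} = (\mathit{op}, \mathit{ra}, \mathit{rb}, \mathit{rc}, \mathit{addr}).
\]

The fields correspond with op-code, registers and address/value field
respectively.

The next-state function $\mathit{DecodeDisp.} : \mathit{STATE} \to \mathit{Decode}$ is defined
below.

\[
\mathit{DecodeDisp.}(\mathit{state}) =
\begin{cases}
\mathit{Merge}(\pi^{Decode}_{decode}(\mathit{Decode}(\mathit{ibuffer}, \mathit{decode})),\ \pi^{Dispatch}_{decode}(\mathit{Dispatch}(\mathit{decode}, \mathit{issue}, \mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache}))), & \text{if } \mathit{reset} = 0; \\
\mathit{Reset}(\mathit{decode}), & \text{otherwise.}
\end{cases}
\]

If the pipeline is not being reset then entries are dispatched to the issue
unit Issue by the operation $\mathit{Dispatch} : \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers} \to \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers}$ (Sect. 8.3), and
decoded from the instruction buffer Ibuffer by Decode, with results
combined by Merge. If the pipeline is being reset then the decode and dispatch
unit is emptied by the buffer operation Reset.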

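The Issue Unit. The state of the issue unit is:

\[
\mathit{Issue} = \mathit{AddIssue}_1 \times \mathit{AddIssue}_2 \times \mathit{BrIssue} \times \mathit{LsrIssue},
\]

where

\[
\begin{aligned}
\mathit{AddIssue}_1 &= \mathit{List}(\mathit{addsize}_1, \mathit{AddEntry}), \\
\mathit{AddIssue}_2 &= \mathit{List}(\mathit{addsize}_2, \mathit{AddEntry}), \\
\mathit{BrIssue} &= \mathit{List}(\mathit{brsize}, \mathit{BrEntry}), \\
\mathit{LsrIssue} &= \mathit{List}(\mathit{lsrsize}, \mathit{LsrEntry}),
\end{aligned}
\]

and

\[
\begin{aligned}
\mathit{AddEntry} &= \mathit{RegOpnd}^2 \times \mathit{RC} \times W_{\log_2(\mathit{reordersize})}, \\
\mathit{BrEntry} &= \mathit{Bit} \times \mathit{Reg} \times W_{\log_2(\mathit{reordersize})} \times \mathit{PC} \times W_{\log_2(\mathit{reordersize})}, \\
\mathit{LsrEntry} &= \mathit{RegOpnd} \times \mathit{MemOpnd} \times \mathit{PC} \times \mathit{RC} \times \mathit{Bit} \times W_{\log_2(\mathit{reordersize})}, \\
\mathit{RegOpnd} &= \mathit{Bit} \times \mathit{RC} \times \mathit{Reg}, \\
\mathit{MemOpnd} &= \mathit{Bit} \times \mathit{PC} \times \mathit{Reg},
\end{aligned}
\]

with component names:

\[
\mathit{issue} = (\mathit{addissue}_1, \mathit{addissue}_2, \mathit{brissue}, \mathit{lsrissue}),
\]

where

\[
\begin{aligned}
\mathit{addentry} &= (\mathit{regopnd}_1, \mathit{regopnd}_2, \mathit{dr}, \mathit{reorderpos}), \\
\mathit{brentry} &= (\mathit{ready}, \mathit{opnd}, \mathit{offset}, \mathit{addr}, \mathit{reorderpos}), \\
\mathit{lsrentry} &= (\mathit{regopnd}, \mathit{memopnd}, \mathit{da}, \mathit{dr}, \mathit{load}, \mathit{reorderpos}),
\end{aligned}
\]

and

\[
\begin{aligned}
\mathit{regopnd} &= (\mathit{ready}, \mathit{sr}, \mathit{opnd}), \\
\mathit{memopnd} &= (\mathit{ready}, \mathit{sa}, \mathit{opnd}).
\end{aligned}
\]

The issue unit contains reservation stations for each of the functional
units: $\mathit{AddIssue}_1$, $\mathit{AddIssue}_2$, BrIssue and LsrIssue. Each reservation
station is modeled using the finite list algebra. An adder reservation station
entry consists of two register operand entries ($\mathit{regopnd}_1$ and $\mathit{regopnd}_2$),
a destination register dr, and an assigned location in the reorder buffer
reorderpos.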

The register operand entries each consist of a ready flag ready, a 
register location sr, and a machine word opnd. If the flag ready is set 
then the operand will already be stored in opnd, otherwise it will be 
fetched later from register location sr. 

The branch unit reservation station consists of a flag ready, a machine 
word opnd, a reorder buffer offset offset, a memory address addr, and an 
assigned location in the reorder buffer reorderpos. If the flag ready is set 
then opnd will contain the current contents of register zero. Otherwise, 
this will be fetched later. The offset field is used to ensure the pc-relative 
branch is made to the correct address. If the reorder buffer is not empty 
when the branch is dispatched then the PM program counter value upon 
dispatch will be inconsistent with the branch instruction address. The 
offset field compensates for this.

A load-store reservation station entry consists of a register operand
regopnd (identical to an adder reservation station’s operand entry), an
address operand memopnd, a memory destination address da, a destination
register dr, a flag load, and an assigned location in the reorder buffer
reorderpos. The address operand memopnd consists of a ready flag ready,
a data-cache memory location sa and a machine word opnd. If the ready
flag is set, then the operand to be loaded/stored is already present in
the opnd field of memopnd or regopnd as appropriate. Otherwise, it is
fetched later from register location sr (in regopnd) in the case of a store
instruction, or data cache location sa (in memopnd) in the case of a load
instruction. The flag load is set when a load instruction is to be executed.

The next-state function $\mathit{Issue} : \mathit{STATE} \to \mathit{Issue}$ is defined below.

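\[
\mathit{Issue}(\mathit{state}) =
\begin{cases}
\mathit{IssueRmv}(\langle \pi^{Dispatch}_{issue}(\mathit{Dispatch}(\mathit{decode}, \mathit{issue}, \mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache})) \rangle,\ \langle \pi_{execs}(\mathit{Execute}(\mathit{issue}, \mathit{registers}, \mathit{Dcache})) \rangle), & \text{if } \mathit{reset} = 0; \\
(\mathit{Reset}(\mathit{addissue}_1), \mathit{Reset}(\mathit{addissue}_2), \mathit{Reset}(\mathit{brissue}), \mathit{Reset}(\mathit{lsrissue})), & \text{otherwise.}
\end{cases}
\]

If the pipeline is not being reset then entries are pushed onto the
reservation station by Dispatch and removed by the operation
$\mathit{Execute} : \mathit{Issue} \times \mathit{Registers} \times \mathit{Dcache} \to \mathbb{N}^4 \times \mathit{Functional}$ (Sect. 8.3).

The projection $\pi_{execs} : \mathbb{N}^4 \times \mathit{Functional} \to \mathbb{N}^4$ gives a 4-tuple of
natural numbers identifying the instructions that have been issued from
each of the four reservation stations ($\mathit{addissue}_1, \ldots, \mathit{lsrissue}$). The issued
instructions are then removed from the reservation stations by the hidden
function $\mathit{IssueRmv} : \mathit{Issue} \times \mathbb{N}^4 \to \mathit{Issue}$.

\[
\begin{aligned}
\mathit{IssueRmv}(&\langle \mathit{addissue}_1, \mathit{addissue}_2, \mathit{brissue}, \mathit{lsrissue} \rangle,\ \langle \mathit{ex}_1, \mathit{ex}_2, \mathit{ex}_3, \mathit{ex}_4 \rangle) = \\
&(\mathit{Remove}(\mathit{addissue}_1, \mathit{ex}_1),\ \mathit{Remove}(\mathit{addissue}_2, \mathit{ex}_2),\ \mathit{Remove}(\mathit{brissue}, \mathit{ex}_3),\ \mathit{Remove}(\mathit{lsrissue}, \mathit{ex}_4)).
\end{aligned}
\]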

Each reservation station has the entries of executed instructions removed 
by the list operation Remove, which removes a specified element from a 
list. 
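
As an illustration of this removal step, here is a small Python sketch. It rests on two assumptions that the text does not state for Remove itself: positions are numbered from 1, and position 0 means that no instruction was issued from that station (mirroring the test toexec > 0 used for the adders in Sect. 8.3). The function names are hypothetical.

```python
def remove(station, position):
    """Return the station without the entry at a 1-based position.

    Position 0 is taken to mean 'nothing was issued' (an assumption).
    """
    if position == 0:
        return list(station)
    return station[:position - 1] + station[position:]


def issue_rmv(issue, execs):
    """Remove the issued entry, if any, from each of the four stations."""
    return tuple(remove(station, ex) for station, ex in zip(issue, execs))


issue = (["a1", "a2"], ["b1"], [], ["l1", "l2", "l3"])
execs = (1, 0, 0, 2)   # first adder issued entry 1, load-store issued entry 2
result = issue_rmv(issue, execs)
```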
The Functional Units. The states of the functional units are:

\[
\mathit{Functional} = \mathit{Adder}^2 \times \mathit{Branch} \times \mathit{LoadStore},
\]

where

\[
\begin{aligned}
\mathit{Adder} &= \mathit{Reg} \times \mathit{RC} \times \mathit{Bit} \times W_{\log_2(\mathit{reordersize})}, \\
\mathit{Branch} &= \mathit{PC} \times \mathit{Bit} \times \mathit{Bit} \times W_{\log_2(\mathit{reordersize})}, \\
\mathit{LoadStore} &= \mathit{Reg} \times \mathit{PC} \times \mathit{Bit} \times \mathit{Bit} \times W_{\log_2(\mathit{reordersize})},
\end{aligned}
\]

with component names:

\[
\mathit{functional} = (\mathit{adder}_1, \mathit{adder}_2, \mathit{branch}, \mathit{loadstore}),
\]

where

\[
\begin{aligned}
\mathit{adder} &= (\mathit{result}, \mathit{dest}, \mathit{done}, \mathit{reorderpos}), \\
\mathit{branch} &= (\mathit{result}, \mathit{taken}, \mathit{done}, \mathit{reorderpos}), \\
\mathit{loadstore} &= (\mathit{result}, \mathit{dest}, \mathit{load}, \mathit{done}, \mathit{reorderpos}).
\end{aligned}
\]

The adder functional units consist of a result word result, a destination
register address dest, a flag done, and a reorder buffer location
reorderpos. The flag done is set when the unit has finished computing a
result to be inserted into the reorder buffer. The branch functional unit
consists of a memory address result, flags taken and done, and a reorder
buffer location reorderpos. The flag taken is set when the condition for
the branch holds. The flag done serves the same purpose as that in the
adder functional units. The load-store functional unit consists of a result
word result, a destination address dest, flags load and done, and a reorder
buffer location reorderpos. The load flag distinguishes between load and
store instructions; the done flag serves the same purpose as in the adder
and branch units.

The next-state function $\mathit{Functional} : \mathit{STATE} \to \mathit{Functional}$ is defined
below.

\[
\mathit{Functional}(\mathit{state}) =
\begin{cases}
\pi^{Execute}_{functional}(\mathit{Execute}(\mathit{issue}, \mathit{registers}, \mathit{Dcache})), & \text{if } \mathit{reset} = 0; \\
((0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0, 0)), & \text{otherwise.}
\end{cases}
\]

If the pipeline is not being reset then instructions are executed; otherwise
all registers are set to zero. Note that clearing just the done flags would
be sufficient to reset each functional unit, since this would prevent results
from the functional units being moved to the reorder buffer (Sect. 8.3).

The Reorder Buffer Unit. The state of the reorder buffer unit is:

\[
\mathit{ReorderBuf} = \mathit{Buffer}(\mathit{reordersize}, \mathit{ReorderEntry}),
\]

where

\[
\mathit{ReorderEntry} = W_3 \times \mathit{PC} \times \mathit{Reg},
\]

with component names:

\[
\mathit{reorderentry} = (\mathit{type}, \mathit{dest}, \mathit{word}).
\]

The reorder buffer entry consists of a three-bit content-type word type, a
destination word dest, and a result word word. Valid type values will be
represented by unique constants wait, skip, dcache, reg and count.

The next-state function $\mathit{Reorder} : \mathit{STATE} \to \mathit{ReorderBuf}$ is defined
below.

\[
\mathit{Reorder}(\mathit{state}) =
\begin{cases}
\mathit{ReorderIns}(\mathit{Merge}(\pi^{Dispatch}_{reorderbuf}(\mathit{Dispatch}(\mathit{decode}, \mathit{issue}, \mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache})),\ \pi^{Commit}_{reorderbuf}(\mathit{Commit}(\mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache}))),\ \mathit{functional}), & \text{if } \mathit{reset} = 0; \\
\mathit{Reset}(\mathit{reorderbuf}), & \text{otherwise.}
\end{cases}
\]

If the pipeline is not being flushed, entries are removed from the buffer by
the operation $\mathit{Commit} : \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache} \to \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache}$ (Sect. 8.3), place-holding entries are added (to hold future
results) by Dispatch, and available results from the functional units
are inserted into their assigned entries by the operation
$\mathit{ReorderIns} : \mathit{ReorderBuf} \times \mathit{Functional} \to \mathit{ReorderBuf}$ (Sect. 8.3). If the pipeline is being
flushed then the reorder unit is cleared using the buffer operation Reset.

The Register Units. The state of the register unit is:

\[
\mathit{Registers} = \mathit{Reg} \times [\mathit{RC} \to \mathit{Bit}] \times \mathit{List}(\mathit{memusize}, \mathit{PC}) \times \mathit{Bit} \times \mathit{PC},
\]

with component names:

\[
\mathit{registers} = (\mathit{reg}, \mathit{regu}, \mathit{memu}, \mathit{reset}, \mathit{pc}).
\]

This unit consists of the register array reg, a register usage table regu, a
memory address usage list memu, a flag reset, and the program counter
pc. The components regu and memu are used to keep track of registers
and memory locations that are to be written to by instructions that have
been dispatched and have not yet committed. This information is used to
resolve dependencies in instruction dispatch.

The next-state function $\mathit{Registers} : \mathit{STATE} \to \mathit{Registers}$ is defined
below.

\[
\mathit{Registers}(\mathit{state}) =
\begin{cases}
\pi^{Commit}_{registers}(\mathit{Commit}(\mathit{reorderbuf},\ \pi^{Dispatch}_{registers}(\mathit{Dispatch}(\mathit{decode}, \mathit{issue}, \mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache})),\ \mathit{Dcache})), & \text{if } \mathit{reset} = 0; \\
(\mathit{reg}, \mathit{emptyregu}, \mathit{Reset}(\mathit{memu}), 0, \mathit{pc}), & \text{otherwise.}
\end{cases}
\]

If the pipeline is not being reset, then the register usage table and memory
usage list are updated by Dispatch. After this, all components may be
altered by Commit. Dispatch and commit results are combined by simple
composition. If the pipeline is being reset then the register usage table
and the memory usage list are cleared, and the reset flag is set to zero.

The hidden constant $\mathit{emptyregu} \in [\mathit{RC} \to \mathit{Bit}]$ is defined by

\[
\mathit{emptyregu}(i) = 0, \quad \text{for all } i \in \mathit{RC}.
\]



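The Data Cache. The state of the data cache is $\mathit{Dcache} = [\mathit{PC} \to \mathit{Reg}]$,
and the next-state function $\mathit{Dcache} : \mathit{STATE} \to \mathit{Dcache}$ is defined below.

\[
\mathit{Dcache}(\mathit{state}) =
\begin{cases}
\pi^{Commit}_{Dcache}(\mathit{Commit}(\mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache})), & \text{if } \mathit{reset} = 0; \\
\mathit{Dcache}, & \text{otherwise.}
\end{cases}
\]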

If the pipeline is not being reset then the data cache is only affected by 
Commit. If the pipeline is being reset, the data cache remains unchanged. 



8.3 Processor Operations 

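The next-state functions for each of the physical units of ACS are defined
in terms of conceptual operations. The six operation stages form the ACS
Operations algebra, defined below.

Algebra ACS Operations
Carrier Sets
    $\mathit{Icache}$, $\mathit{Ibuffer}$, $\mathit{Decode}$, $\mathit{Issue}$, $\mathit{Functional}$, $\mathit{ReorderBuf}$,
    $\mathit{Registers}$, $\mathit{Dcache}$
Constants

Operations
    $\mathit{Fetch} : \mathit{Icache} \times \mathit{Ibuffer} \to \mathit{Icache} \times \mathit{Ibuffer}$
    $\mathit{Decode} : \mathit{Ibuffer} \times \mathit{Decode} \to \mathit{Ibuffer} \times \mathit{Decode}$
    $\mathit{Dispatch} : \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers} \to \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers}$
    $\mathit{Execute} : \mathit{Issue} \times \mathit{Registers} \times \mathit{Dcache} \to \mathbb{N}^4 \times \mathit{Functional}$
    $\mathit{Commit} : \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache} \to \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache}$
    $\mathit{ReorderIns} : \mathit{ReorderBuf} \times \mathit{Functional} \to \mathit{ReorderBuf}$
End Algebra.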



The components of this algebra are defined in the following sections. 



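Instruction Fetch. The operation $\mathit{Fetch} : \mathit{Icache} \times \mathit{Ibuffer} \to \mathit{Icache} \times \mathit{Ibuffer}$
is defined below.

\[
\mathit{Fetch}(\mathit{Icache}, \mathit{ibuffer}) =
\begin{cases}
\mathit{Fetch}(\mathit{Icache}, \langle \mathit{Push}(\mathit{instbuf}, \mathit{Icache}(\mathit{fetchpc})),\ \mathit{fetchpc} + 1 \rangle), & \text{if not } \mathit{Full}(\mathit{instbuf}); \\
(\mathit{Icache}, \mathit{ibuffer}), & \text{otherwise.}
\end{cases}
\]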

Instructions are repeatedly fetched from the instruction cache, and added 
to the instruction buffer using the buffer operation Push, until the in- 
struction buffer is full (tested by the buffer operation Full). In ACS, the 
operation is bounded by the size of the instruction buffer. In a real micro- 
processor, the bandwidth of the bus between the instruction cache and 
the instruction buffer will also bound the number of instructions that can 
be transferred in a clock cycle. 
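
The recursion in Fetch unrolls into a simple loop. The Python sketch below makes several simplifying assumptions: the instruction buffer is a plain list, the instruction cache is a list indexed by address and long enough for every fetch, and `ibufsize` is passed explicitly. The function name `fetch` is hypothetical.

```python
def fetch(icache, instbuf, fetchpc, ibufsize):
    """Fill the instruction buffer from the cache until it is full."""
    instbuf = list(instbuf)                 # work on a copy of the buffer
    while len(instbuf) < ibufsize:          # "not Full(instbuf)"
        instbuf.append(icache[fetchpc])     # Push(instbuf, Icache(fetchpc))
        fetchpc += 1                        # advance the fetch program counter
    return instbuf, fetchpc


icache = [f"i{n}" for n in range(10)]
filled = fetch(icache, ["i0"], 1, 4)
unchanged = fetch(icache, ["a", "b"], 5, 2)   # already full: nothing fetched
```

The loop terminates because each iteration adds one entry to a buffer of bounded size, which is the bound the text refers to.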



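Instruction Decode. The operation

\[
\mathit{Decode} : \mathit{Ibuffer} \times \mathit{Decode} \to \mathit{Ibuffer} \times \mathit{Decode}
\]

is defined below.

\[
\mathit{Decode}(\mathit{ibuffer}, \mathit{decode}) =
\begin{cases}
\mathit{Decode}(\langle \mathit{Pop}(\mathit{instbuf}), \mathit{fetchpc} \rangle,\ \langle \mathit{Push}(\mathit{decode}, \mathit{DecodeEntry}(\mathit{Top}(\mathit{instbuf}))) \rangle), & \text{if not } \mathit{Empty}(\mathit{instbuf}) \text{ and not } \mathit{Full}(\mathit{decode}); \\
(\mathit{ibuffer}, \mathit{decode}), & \text{otherwise.}
\end{cases}
\]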



Instructions are removed from the instruction buffer using the buffer 
operation Pop, decoded by DecodeEntry, and then added to the decode- 
dispatch unit buffer using the buffer operation Push, until either the 
instruction buffer is emptied (tested by the buffer operation Empty), or 
the decode-dispatch unit buffer is full. The operation is bounded by the 
sizes of these two buffers. 

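The hidden function $\mathit{DecodeEntry} : \mathit{Reg} \to \mathit{OP} \times \mathit{RC}^3 \times \mathit{PC}$ decodes
an instruction word.

\[
\mathit{DecodeEntry}(\mathit{ibufentry}) = (\mathit{op}(\mathit{ibufentry}), \mathit{ra}(\mathit{ibufentry}), \mathit{rb}(\mathit{ibufentry}), \mathit{rc}(\mathit{ibufentry}), \mathit{addr}(\mathit{ibufentry})).
\]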



There are five fields: an op-code, three register addresses and a data mem- 
ory address; each of these fields is extracted using the functions op, ra, 
rb, rc and addr from Sect. 6. 
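
A Python sketch of the decode loop follows. The bit layout used by `decode_entry` (a 32-bit word holding a 4-bit op-code, three 4-bit register indexes and a 16-bit address/value field) is purely hypothetical; the actual field extraction functions are those of Sect. 6, which are not reproduced here. Buffers are again modelled as plain lists.

```python
def decode_entry(word):
    """Split an instruction word into (op, ra, rb, rc, addr).

    The field positions are an illustrative assumption only.
    """
    op = (word >> 28) & 0xF
    ra = (word >> 24) & 0xF
    rb = (word >> 20) & 0xF
    rc = (word >> 16) & 0xF
    addr = word & 0xFFFF
    return (op, ra, rb, rc, addr)


def decode(instbuf, decbuf, decsize):
    """Move entries from the instruction buffer into the decode buffer."""
    instbuf, decbuf = list(instbuf), list(decbuf)
    # Continue while "not Empty(instbuf) and not Full(decode)".
    while instbuf and len(decbuf) < decsize:
        decbuf.append(decode_entry(instbuf.pop(0)))   # Pop, decode, Push
    return instbuf, decbuf


word = 0x12340010
remaining, decoded = decode([word, word, word], [], 2)
```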



Instruction Dispatch. The operation $\mathit{Dispatch} : \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers} \to \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers}$ affects
the decode-dispatch and issue units, reorder buffer, and registers.

\[
\mathit{Dispatch}(\mathit{decode}, \mathit{issue}, \mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache}) =
\begin{cases}
\mathit{Dispatch}(\mathit{DispatchAdd1}(\mathit{decode}, \ldots, \mathit{registers}), \mathit{Dcache}), & \text{if } \mathit{CanDispatchAdd1}(\mathit{decode}, \mathit{addissue}_1, \mathit{addissue}_2, \mathit{reorderbuf}, \mathit{regu}); \\
\mathit{Dispatch}(\mathit{DispatchAdd2}(\mathit{decode}, \ldots, \mathit{registers}), \mathit{Dcache}), & \text{if } \mathit{CanDispatchAdd2}(\mathit{decode}, \mathit{addissue}_1, \mathit{addissue}_2, \mathit{reorderbuf}, \mathit{regu}); \\
\mathit{Dispatch}(\mathit{DispatchBr}(\mathit{decode}, \ldots, \mathit{registers}), \mathit{Dcache}), & \text{if } \mathit{CanDispatchBr}(\mathit{decode}, \mathit{brissue}, \mathit{reorderbuf}); \\
\mathit{Dispatch}(\mathit{DispatchLoad}(\mathit{decode}, \ldots, \mathit{registers}), \mathit{Dcache}), & \text{if } \mathit{CanDispatchLoad}(\mathit{decode}, \mathit{lsrissue}, \mathit{reorderbuf}, \mathit{registers}); \\
\mathit{Dispatch}(\mathit{DispatchStore}(\mathit{decode}, \ldots, \mathit{registers}, \mathit{Dcache}), \mathit{Dcache}), & \text{if } \mathit{CanDispatchStore}(\mathit{decode}, \mathit{lsrissue}, \mathit{reorderbuf}, \mathit{registers}); \\
\mathit{Dispatch}(\mathit{DispatchSet}(\mathit{decode}, \ldots, \mathit{registers}), \mathit{Dcache}), & \text{if } \mathit{CanDispatchSet}(\mathit{decode}, \mathit{addissue}_1, \mathit{reorderbuf}, \mathit{regu}); \\
(\mathit{decode}, \ldots, \mathit{registers}, \mathit{Dcache}), & \text{otherwise.}
\end{cases}
\]

The first six cases correspond with instruction dispatch for each of the five
types of instruction. (There are two cases for the add instruction, since
there are two addition functional units.) In each of these cases the
reservation station is filled and the decode unit entry removed. This process is
repeated until the seventh case applies and no further instructions are
dispatched. Dispatch is bounded in ACS by the size of the decode-dispatch
unit buffer, and/or the sizes of the functional unit buffers.

Dispatching Add Instructions to the First Adder Unit. The hidden
function DispatchAdd1 controls instruction dispatch to the first
addition unit’s reservation station, and is defined below.

\[
\mathit{DispatchAdd1} : \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers} \to \mathit{Decode} \times \mathit{Issue} \times \mathit{ReorderBuf} \times \mathit{Registers},
\]

\[
\begin{aligned}
\mathit{DispatchAdd1}(&\mathit{decode}, \mathit{issue}, \mathit{reorderbuf}, \mathit{registers}) = \\
(&\mathit{Pop}(\mathit{decode}), && (1) \\
&\mathit{Push}(\mathit{addissue}_1, \mathit{AddIssEnt}(\mathit{Top}(\mathit{decode}), \mathit{registers}, \mathit{reorderbuf} \cdot \mathit{tail} + 1)), && (2) \\
&\mathit{addissue}_2, \mathit{brissue}, \mathit{lsrissue}, \\
&\mathit{Push}(\mathit{reorderbuf}, (\mathit{wait}, 0, 0)), && (3) \\
&(\mathit{reg}, \mathit{regu}[1 / \pi^{decentry}_{rc}(\mathit{Top}(\mathit{decode}))], \mathit{memu}, \mathit{reset}, \mathit{pc})). && (4)
\end{aligned}
\]

When an add instruction is dispatched, the following steps are taken.

1. The decode-dispatch unit buffer is popped.

2. A reservation station entry is constructed by AddIssEnt, and pushed
onto the reservation station list of the first adder. This entry includes
the address of the next free reorder buffer entry ($\mathit{reorderbuf} \cdot \mathit{tail} + 1$).
Note, it is assumed that the concrete implementation of the buffer
described in [Fox98] is being used.

3. An entry for the result of the addition in the reorder buffer (labeled
‘wait’) is pushed onto the reorder buffer.

4. The destination register rc is marked as in use ($\mathit{regu}[1 / \pi^{decentry}_{rc}(\mathit{Top}(\mathit{decode}))]$).

The hidden function

\[
\mathit{AddIssEnt} : \mathit{DecEntry} \times \mathit{Registers} \times W_{\log_2(\mathit{reordersize})} \to \mathit{AddEntry}
\]

is defined below.

\[
\mathit{AddIssEnt}(\mathit{decentry}, \mathit{registers}, \mathit{reorderpos}) = (\mathit{DispatchOperand}(\mathit{ra}, \mathit{registers}),\ \mathit{DispatchOperand}(\mathit{rb}, \mathit{registers}),\ \mathit{rc},\ \mathit{reorderpos}).
\]

The operands ra and rb are fetched (if ready; see DispatchOperand below),
the destination register is set to rc and the reorder position becomes
the successor to the current reorder buffer tail.

The hidden function $\mathit{DispatchOperand} : \mathit{RC} \times \mathit{Registers} \to \mathit{RegOpnd}$
is defined below.

\[
\mathit{DispatchOperand}(\mathit{sr}, \mathit{registers}) =
\begin{cases}
(0, \mathit{sr}, 0), & \text{if } \mathit{regu}(\mathit{sr}) = 1; \\
(1, 0, \mathit{reg}(\mathit{sr})), & \text{otherwise.}
\end{cases}
\]

If the source register sr is reserved (that is, the register sr is already in
use), then the ready bit is set to zero and sr is stored in the reservation
station; the contents of sr will be fetched later. If the register is not in
use then the ready bit is set to one, and the contents of sr are stored in 
the reservation station. 
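
In Python this case distinction might be sketched as follows, assuming (for illustration only) that the register array and the usage table are lists indexed by register number. The name `dispatch_operand` is hypothetical.

```python
def dispatch_operand(sr, reg, regu):
    """Build a register operand entry (ready, sr, opnd) for source register sr."""
    if regu[sr] == 1:
        # Register is reserved by an uncommitted instruction: fetch it later.
        return (0, sr, 0)
    # Register is free: capture its current contents now.
    return (1, 0, reg[sr])


reg = [7, 8, 9]      # register contents
regu = [0, 1, 0]     # register 1 is currently in use
pending = dispatch_operand(1, reg, regu)
ready = dispatch_operand(2, reg, regu)
```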

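The hidden function

\[
\mathit{CanDispatchAdd1} : \mathit{Decode} \times \mathit{AddIssue}_1 \times \mathit{AddIssue}_2 \times \mathit{ReorderBuf} \times [\mathit{RC} \to \mathit{Bit}] \to \mathbb{B}
\]

is defined below.

\[
\mathit{CanDispatchAdd1}(\mathit{decode}, \mathit{addissue}_1, \mathit{addissue}_2, \mathit{reorderbuf}, \mathit{regu}) =
\begin{cases}
\mathit{tt}, & \text{if not } \mathit{Empty}(\mathit{decode}) \text{ and } \pi^{decentry}_{op}(\mathit{Top}(\mathit{decode})) = \mathit{add} \quad (1) \\
& \text{and not } \mathit{Full}(\mathit{addissue}_1) \text{ and not } \mathit{Full}(\mathit{reorderbuf}) \quad (2) \\
& \text{and } \mathit{regu}(\pi^{decentry}_{rc}(\mathit{Top}(\mathit{decode}))) = 0 \quad (3) \\
& \text{and } (\mathit{Size}(\mathit{addissue}_1) \le \mathit{Size}(\mathit{addissue}_2) \text{ or } \mathit{Full}(\mathit{addissue}_2)); \quad (4) \\
\mathit{ff}, & \text{otherwise.}
\end{cases}
\]

In order to dispatch an add instruction to the first adder unit:

1. there must be an add instruction at the top of the (non empty) decode
unit;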

2. the first adder reservation station and reorder buffer must not be full; 

3. the destination register for the add must not be in use; and 

4. the first adder reservation station must have fewer or the same number 
of instructions pending execution as the second adder station, or the 
second adder reservation station must be full. This prevents the same 
instruction from being dispatched to both adder reservation stations. 
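
The four conditions translate directly into a Boolean predicate. The Python sketch below assumes list-based buffers, passes the capacities `addsize1`, `addsize2` and `reordersize` as explicit arguments, and represents the add op-code by the string `"add"`; all of these are illustrative choices rather than part of the ACS definition.

```python
ADD = "add"  # hypothetical encoding of the add op-code


def can_dispatch_add1(decode, addissue1, addissue2, reorderbuf, regu,
                      addsize1, addsize2, reordersize):
    """Return True if the head of the decode buffer may go to the first adder."""
    if not decode:                                   # (1) decode unit non-empty
        return False
    op, _ra, _rb, rc, _addr = decode[0]
    return (op == ADD                                # (1) head is an add
            and len(addissue1) < addsize1            # (2) station not full
            and len(reorderbuf) < reordersize        # (2) reorder buffer not full
            and regu[rc] == 0                        # (3) destination is free
            and (len(addissue1) <= len(addissue2)    # (4) balance the two adders
                 or len(addissue2) >= addsize2))     #     unless the second is full


decode_buf = [("add", 1, 2, 3, 0)]
free = [0, 0, 0, 0]
busy = [0, 0, 0, 1]
```

With both stations empty and register 3 free, `can_dispatch_add1(decode_buf, [], [], [], free, 2, 2, 4)` holds; marking register 3 as in use makes it fail on condition 3.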

The process of dispatching branch, store, load and set instructions, 
and of dispatching add instructions to the second adder unit, is similar, 
and we omit the definitions. 



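Instruction Execution. The operation

\[
\mathit{Execute} : \mathit{Issue} \times \mathit{Registers} \times \mathit{Dcache} \to \mathbb{N}^4 \times \mathit{Functional}
\]

returns a 4-tuple of reservation station locations, representing
instructions that have been executed, together with the state of each of
the functional units, and is defined below.

\[
\begin{array}{l}
\mathit{Execute}(\mathit{issue}, \mathit{registers}, \mathit{Dcache}) = \\
\quad (\pi_{\mathit{exec}}(\mathit{Adder}(\mathit{addissue}_1, \mathit{registers})), \\
\quad\ \pi_{\mathit{exec}}(\mathit{Adder}(\mathit{addissue}_2, \mathit{registers})), \\
\quad\ \pi_{\mathit{exec}}(\mathit{Branch}(\mathit{brissue}, \mathit{registers})), \\
\quad\ \pi_{\mathit{exec}}(\mathit{LoadStore}(\mathit{lsrissue}, \mathit{registers}, \mathit{Dcache})), \\
\quad\ \pi_{\mathit{unit}}(\mathit{Adder}(\mathit{addissue}_1, \mathit{registers})), \\
\quad\ \pi_{\mathit{unit}}(\mathit{Adder}(\mathit{addissue}_2, \mathit{registers})), \\
\quad\ \pi_{\mathit{unit}}(\mathit{Branch}(\mathit{brissue}, \mathit{registers})), \\
\quad\ \pi_{\mathit{unit}}(\mathit{LoadStore}(\mathit{lsrissue}, \mathit{registers}, \mathit{Dcache}))).
\end{array}
\]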

The functions Adder, Branch, and LoadStore each return a reservation
station location and functional unit state as a pair. The projection

\[
\pi_{\mathit{exec}} : \{0, \ldots, \max(\mathit{addsize}_1, \ldots, \mathit{lsrsize})\} \times (\mathit{Adder} \cup \mathit{Branch} \cup \mathit{LoadStore}) \to \{0, \ldots, \max(\mathit{addsize}_1, \ldots, \mathit{lsrsize})\}
\]

projects out the reservation station location, and the projection

\[
\pi_{\mathit{unit}} : \{0, \ldots, \max(\mathit{addsize}_1, \ldots, \mathit{lsrsize})\} \times (\mathit{Adder} \cup \mathit{Branch} \cup \mathit{LoadStore}) \to (\mathit{Adder} \cup \mathit{Branch} \cup \mathit{LoadStore})
\]

projects the functional unit state.

Executing Add and Set Instructions. The hidden function Adder is
defined below.

\[
\mathit{Adder} : (\mathit{List}(\mathit{addsize}_1, \mathit{AddEntry}) \cup \mathit{List}(\mathit{addsize}_2, \mathit{AddEntry})) \times \mathit{Registers} \to \{0, \ldots, \max(\mathit{addsize}_1, \mathit{addsize}_2)\} \times \mathit{Adder},
\]

\[
\mathit{Adder}(\mathit{addissue}, \mathit{registers}) =
\left\{
\begin{array}{ll}
(\mathit{toexec}, \mathit{AddExecute}(\mathit{Eval}(\mathit{addissue}, \mathit{toexec}), \mathit{reg})), & \text{if } \mathit{toexec} > 0; \\
(\mathit{toexec}, (0, 0, 0, 0)), & \text{otherwise,}
\end{array}
\right.
\]

where $\mathit{toexec} = \mathit{Find}(\mathit{addissue}, \mathit{CanExecuteAdd}_{\mathit{regu}})$ is the reservation
station location of an add instruction that may be able to execute. The
list operation Find is defined in [Fox98]. If an add instruction is able to
execute ($\mathit{toexec} > 0$), then toexec is returned together with the state of
the adder functional unit after executing reservation station entry toexec.
Note that the oldest instructions in a reservation station list will be
executed first, subject to dependencies: see the definition of Find in [Fox98].

At this stage of the instruction pipeline, set instructions are
indistinguishable from add instructions of the form 0 + x: see [Fox98].
Therefore, we need take no special steps to execute them.

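The hidden function $\mathit{AddExecute} : \mathit{AddEntry} \times \mathit{Reg} \to \mathit{Adder}$ is defined
below.

\[
\mathit{AddExecute}(\mathit{addentry}, \mathit{reg}) =
(\mathit{GetOperand}(\mathit{regopnd}_1, \mathit{reg}) + \mathit{GetOperand}(\mathit{regopnd}_2, \mathit{reg}),\ \mathit{dr},\ 1,\ \mathit{reorderpos}).
\]

The two register operands are added, the destination address dr and
reorder unit position are copied, and the done flag is set to one.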

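The hidden function $\mathit{GetOperand} : \mathit{RegOpnd} \times \mathit{Reg} \to \mathit{Reg}$ is defined
below.

\[
\mathit{GetOperand}(\mathit{regopnd}, \mathit{reg}) =
\left\{
\begin{array}{ll}
\mathit{opnd}, & \text{if } \mathit{ready} = 1; \\
\mathit{reg}(\mathit{sr}), & \text{otherwise.}
\end{array}
\right.
\]

If ready = 1 then the operand has already been fetched, and is stored in
opnd. Otherwise the operand is fetched from the appropriate PM-level
register $\mathit{reg}(\mathit{sr})$.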
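
As an informal illustration of how AddExecute and GetOperand fit together, the following Python sketch mirrors the two definitions. The record layouts (RegOpnd with fields ready, opnd, sr; AddEntry with fields regopnd1, regopnd2, dr, reorderpos) are inferred from the field names used in the text, the register file is assumed to be a list indexed by register number, and unbounded Python integers stand in for fixed-width words.

```python
from typing import List, NamedTuple, Tuple


class RegOpnd(NamedTuple):
    ready: int  # 1 if the operand has already been fetched
    opnd: int   # the fetched value (meaningful only when ready == 1)
    sr: int     # index of the source register


class AddEntry(NamedTuple):
    regopnd1: RegOpnd
    regopnd2: RegOpnd
    dr: int          # destination register
    reorderpos: int  # position reserved in the reorder buffer


def get_operand(regopnd: RegOpnd, reg: List[int]) -> int:
    """Use the stored operand if it is ready, else read the register."""
    return regopnd.opnd if regopnd.ready == 1 else reg[regopnd.sr]


def add_execute(addentry: AddEntry, reg: List[int]) -> Tuple[int, int, int, int]:
    """Return (result, dr, done, reorderpos) with the done flag set to one."""
    result = (get_operand(addentry.regopnd1, reg)
              + get_operand(addentry.regopnd2, reg))
    return (result, addentry.dr, 1, addentry.reorderpos)
```

For example, with reg = [0, 10, 20, 30], an entry whose first operand is already fetched as 5 and whose second operand names register 2 produces the result 25.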

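The family of sets CanExecuteAdd of executable add instructions is
defined below.

\[
\langle \mathit{CanExecuteAdd}_{\mathit{regu}} \subseteq \mathit{AddEntry} \mid \mathit{regu} \in [RC \to \mathit{Bit}] \rangle,
\]

\[
\mathit{CanExecuteAdd}_{\mathit{regu}} =
\{\mathit{addentry} \in \mathit{AddEntry} \mid \mathit{Ready}(\mathit{addentry}.\mathit{regopnd}_1, \mathit{regu}) \text{ and } \mathit{Ready}(\mathit{addentry}.\mathit{regopnd}_2, \mathit{regu})\}.
\]

An add instruction may be executed if both its register operands are
ready.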

The function $\mathit{Ready} : \mathit{RegOpnd} \times [RC \to \mathit{Bit}] \to \mathbb{B}$ is defined below.

\[
\mathit{Ready}(\mathit{regopnd}, \mathit{regu}) =
\left\{
\begin{array}{ll}
\mathit{tt}, & \text{if } \mathit{ready} = 1 \text{ or } \mathit{regu}(\mathit{sr}) = 0; \\
\mathit{ff}, & \text{otherwise.}
\end{array}
\right.
\]

The operand is ready if it has already been fetched (ready = 1) or it is no
longer being used as the destination of a waiting instruction ($\mathit{regu}(\mathit{sr}) = 0$).
Note that if an add instruction is using the same register as destination and
operand, then the ready bit will not be set until after its operands have
been dispatched: see DispatchAdd1, AddIssEnt and DispatchOperand
in Sect. 8.3. Therefore, an instruction will not attempt to wait for itself
to commit before being executed.
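
The readiness test and the selection of an entry to execute can be sketched as follows. This is a hedged illustration: the operation Find is defined in [Fox98] and is not reproduced here, so find_oldest is a hypothetical stand-in that assumes the reservation station is a list ordered oldest first and that location 0 means "nothing can execute", in line with the test toexec > 0 in the definition of Adder.

```python
from typing import List, NamedTuple


class RegOpnd(NamedTuple):
    ready: int  # 1 if the operand has already been fetched
    opnd: int   # the fetched value
    sr: int     # index of the source register


class AddEntry(NamedTuple):
    regopnd1: RegOpnd
    regopnd2: RegOpnd


def ready(regopnd: RegOpnd, regu: List[int]) -> bool:
    """Operand is fetched, or its source register is no longer in use."""
    return regopnd.ready == 1 or regu[regopnd.sr] == 0


def can_execute_add(addentry: AddEntry, regu: List[int]) -> bool:
    """Membership test for the set CanExecuteAdd indexed by regu."""
    return ready(addentry.regopnd1, regu) and ready(addentry.regopnd2, regu)


def find_oldest(addissue: List[AddEntry], regu: List[int]) -> int:
    """Hypothetical stand-in for Find: 1-based location of the oldest
    executable entry, or 0 if no entry can execute."""
    for location, entry in enumerate(addissue, start=1):
        if can_execute_add(entry, regu):
            return location
    return 0
```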

The process of executing branch, store and load instructions is sim- 
ilar, and we omit the definitions. 



Filling the Reorder Buffer. The operation

\[
\mathit{ReorderIns} : \mathit{ReorderBuf} \times \mathit{Functional} \to \mathit{ReorderBuf}
\]

is defined below.

\[
\mathit{ReorderIns}(\mathit{reorderbuf}, \mathit{functional}) =
\mathit{ReorderAdd}(\mathit{ReorderAdd}(\mathit{ReorderBranch}(\mathit{ReorderLsr}(\mathit{reorderbuf}, \mathit{loadstore}), \mathit{branch}), \mathit{adder}_2), \mathit{adder}_1).
\]

Results are inserted into the reorder buffer from each of the four functional
units.

The hidden function $\mathit{ReorderAdd} : \mathit{ReorderBuf} \times \mathit{Adder} \to \mathit{ReorderBuf}$
is defined below.

\[
\mathit{ReorderAdd}(\mathit{reorderbuf}, \mathit{adder}) =
\left\{
\begin{array}{ll}
\mathit{Insert}(\mathit{reorderbuf}, (\mathit{reg}, \mathit{pad}(\mathit{dest}), \mathit{result}), \mathit{reorderpos}), & \text{if } \mathit{done} = 1; \\
\mathit{reorderbuf}, & \text{otherwise.}
\end{array}
\right.
\]

The reorder buffer entry consists of (i) a destination flag, set to reg in the
case of an add or set instruction; (ii) the (padded) register destination;
and (iii) the result of the addition (result). The position at which the
entry is inserted within the reorder buffer (reorderpos) is stored within
the adder functional unit. If no instruction has been executed by the
addition unit (done = 0), the reorder buffer is unchanged. The buffer
operation Insert is defined in [Fox98].

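The hidden function

\[
\mathit{ReorderBranch} : \mathit{ReorderBuf} \times \mathit{Branch} \to \mathit{ReorderBuf}
\]

is defined below.

\[
\mathit{ReorderBranch}(\mathit{reorderbuf}, \mathit{branch}) =
\left\{
\begin{array}{ll}
\mathit{Insert}(\mathit{reorderbuf}, (\mathit{count}, \mathit{result}, 0), \mathit{reorderpos}), & \text{if } \mathit{done} = 1 \text{ and } \mathit{taken} = 1; \\
\mathit{Insert}(\mathit{reorderbuf}, (\mathit{skip}, 0, 0), \mathit{reorderpos}), & \text{if } \mathit{done} = 1 \text{ and } \mathit{taken} = 0; \\
\mathit{reorderbuf}, & \text{otherwise.}
\end{array}
\right.
\]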

There are two types of reorder buffer entries for branch instructions. A 
branch which is taken is flagged with count, whereas a branch which is 
not taken is flagged skip. In the case of a taken branch, the branch unit 
result (i.e. the branch address) is stored in the destination field of the 
reorder buffer entry. 

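The hidden function

\[
\mathit{ReorderLsr} : \mathit{ReorderBuf} \times \mathit{LoadStore} \to \mathit{ReorderBuf}
\]

is defined below.

\[
\mathit{ReorderLsr}(\mathit{reorderbuf}, \mathit{loadstore}) =
\left\{
\begin{array}{ll}
\mathit{Insert}(\mathit{reorderbuf}, (\mathit{reg}, \mathit{dest}, \mathit{result}), \mathit{reorderpos}), & \text{if } \mathit{done} = 1 \text{ and } \mathit{load} = 1; \\
\mathit{Insert}(\mathit{reorderbuf}, (\mathit{dcache}, \mathit{dest}, \mathit{result}), \mathit{reorderpos}), & \text{if } \mathit{done} = 1 \text{ and } \mathit{load} = 0; \\
\mathit{reorderbuf}, & \text{otherwise.}
\end{array}
\right.
\]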

If the instruction executed was a load then the reorder buffer entry is 
flagged with reg. If the instruction was a store then the reorder buffer 
entry is flagged with dcache. 



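Instruction Committal. The operation

\[
\mathit{Commit} : \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache} \to \mathit{ReorderBuf} \times \mathit{Registers} \times \mathit{Dcache}
\]

is defined below.

\[
\mathit{Commit}(\mathit{reorderbuf}, \langle \mathit{registers}, \mathit{Dcache} \rangle) =
\left\{
\begin{array}{lll}
(\mathit{reorderbuf}, \mathit{registers}, \mathit{Dcache}), & \text{if } (\text{not } \mathit{Empty}(\mathit{reorderbuf}) \text{ and } \pi^{\mathit{reorderentry}}_{\mathit{type}}(\mathit{Top}(\mathit{reorderbuf})) = \mathit{wait}) & (1) \\
 & \text{or } \mathit{Empty}(\mathit{reorderbuf}); & \\
(\mathit{Reset}(\mathit{reorderbuf}), \mathit{CommitResult}(\mathit{Top}(\mathit{reorderbuf}), \mathit{registers}, \mathit{Dcache})), & \text{if not } \mathit{Empty}(\mathit{reorderbuf}) \text{ and } \pi^{\mathit{reorderentry}}_{\mathit{type}}(\mathit{Top}(\mathit{reorderbuf})) = \mathit{count}; & (2) \\
\mathit{Commit}(\mathit{Pop}(\mathit{reorderbuf}), \langle \mathit{CommitResult}(\mathit{Top}(\mathit{reorderbuf}), \mathit{registers}, \mathit{Dcache}) \rangle), & \text{otherwise.} & (3)
\end{array}
\right.
\]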

There are three cases to consider. 

1. There are no results to be committed, either because the reorder buffer 
is empty or because the topmost entry is waiting for instruction re- 
sults. In this case, the reorder buffer, registers and data cache are 
unchanged. 

2. A branch has been taken. The branch is committed and the reorder 
buffer is reset. 

3. A functional unit result is to be committed. The topmost reorder 
buffer entry is committed and an attempt is made to commit the next 
result. 

The recursion in Commit terminates: each recursive call removes one entry,
so its depth is bounded by the size of the reorder buffer.
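
The three cases can be sketched as a short recursive function. This is a minimal sketch under stated assumptions: the reorder buffer is a Python list of dictionaries ordered oldest first, each with a "type" field; Reset is assumed to produce the empty buffer; and the per-entry update is passed in as the parameter commit_result so that the sketch does not depend on the details of CommitResult.

```python
def commit(reorderbuf, state, commit_result):
    """Commit as many results as possible from the top of the buffer.

    reorderbuf    -- list of entries, oldest first (assumed layout)
    state         -- opaque value standing for (registers, Dcache)
    commit_result -- function (entry, state) -> new state
    """
    # Case 1: empty buffer, or the topmost entry is still waiting.
    if not reorderbuf or reorderbuf[0]["type"] == "wait":
        return reorderbuf, state
    top = reorderbuf[0]
    # Case 2: a taken branch is committed and the buffer is reset.
    if top["type"] == "count":
        return [], commit_result(top, state)
    # Case 3: commit the top entry, then try to commit the next one.
    return commit(reorderbuf[1:], commit_result(top, state), commit_result)
```

Each recursive call is made on a buffer that is one entry shorter, which is why the recursion depth cannot exceed the buffer size.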

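The hidden function

\[
\mathit{CommitResult} : \mathit{ReorderEntry} \times \mathit{Registers} \times \mathit{Dcache} \to \mathit{Registers} \times \mathit{Dcache}
\]

is defined below.

\[
\mathit{CommitResult}(\mathit{reorderentry}, \mathit{registers}, \mathit{Dcache}) =
\left\{
\begin{array}{lll}
(\mathit{reg}[\mathit{word}/\mathit{trim}(\mathit{dest})],\ \mathit{regu}[0/\mathit{trim}(\mathit{dest})],\ \mathit{memu},\ \mathit{reset},\ \mathit{pc} + 1,\ \mathit{Dcache}), & \text{if } \mathit{type} = \mathit{reg}; & (1) \\
(\mathit{reg},\ \mathit{regu},\ \mathit{Remove}(\mathit{memu}, \mathit{Find}(\mathit{memu}, =\!\mathit{dest})),\ \mathit{reset},\ \mathit{pc} + 1,\ \mathit{Dcache}[\mathit{word}/\mathit{dest}]), & \text{if } \mathit{type} = \mathit{dcache}; & (2) \\
(\mathit{reg},\ \mathit{regu},\ \mathit{memu},\ 1,\ \mathit{dest},\ \mathit{Dcache}), & \text{if } \mathit{type} = \mathit{count}; & (3) \\
(\mathit{reg},\ \mathit{regu},\ \mathit{memu},\ \mathit{reset},\ \mathit{pc} + 1,\ \mathit{Dcache}), & \text{if } \mathit{type} = \mathit{skip}. & (4)
\end{array}
\right.
\]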

There are four cases to consider.

1. A result is to be committed to a register. The value is written to the
register, the register usage table is modified such that the register
location dest is no longer active, and the program counter is
incremented. The trim function removes the field-padding zeros added by
pad in ReorderAdd (Sect. 8.3).

2. A result is to be committed to the data cache. The memory address 
usage list is updated by removing the entry with address dest, the 
program counter is incremented, and the result written to the data 
cache. 

3. A branch was taken. The program counter is set to the destination of 
the branch (dest), and the reset flag is set to one. 

4. A branch was not taken (skip). The program counter is incremented. 
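
The four cases translate directly into a function on an immutable record. The sketch below is illustrative and rests on several assumptions: the Registers record groups reg, regu, memu, reset and pc as suggested by the tuples in the definition; a reorder entry is a triple (type, dest, word); trim is taken to be the identity on already-unpadded destinations; memu is a tuple of addresses that contains dest whenever a store commits; and the data cache is a dictionary from addresses to words.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Registers:
    reg: Tuple[int, ...]   # register contents
    regu: Tuple[int, ...]  # register usage bits
    memu: Tuple[int, ...]  # memory addresses with pending stores
    reset: int             # pipeline reset flag
    pc: int                # program counter


def _update(values: Tuple[int, ...], index: int, value: int) -> Tuple[int, ...]:
    """Return a copy of values with values[index] replaced by value."""
    return values[:index] + (value,) + values[index + 1:]


def commit_result(entry, registers: Registers, dcache: Dict[int, int]):
    typ, dest, word = entry
    r = registers
    if typ == "reg":      # (1) write register, free it, advance pc
        return replace(r, reg=_update(r.reg, dest, word),
                       regu=_update(r.regu, dest, 0), pc=r.pc + 1), dcache
    if typ == "dcache":   # (2) drop dest from memu, write cache, advance pc
        i = r.memu.index(dest)
        return (replace(r, memu=r.memu[:i] + r.memu[i + 1:], pc=r.pc + 1),
                {**dcache, dest: word})
    if typ == "count":    # (3) taken branch: jump and raise the reset flag
        return replace(r, reset=1, pc=dest), dcache
    if typ == "skip":     # (4) branch not taken: advance pc only
        return replace(r, pc=r.pc + 1), dcache
    raise ValueError(f"unknown reorder entry type: {typ!r}")
```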

9 Correctness and Verification 

We will formulate the correctness conditions for our superscalar imple- 
mentation with respect to its architectural description, using the tech- 
niques described in Sect. 3 and Sect. 5. We will then discuss the process 
of verification, and consider why the usual technique is problematic in 
the case of superscalar processors. 



9.1 A Correctness Definition for ACS 



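The implementation ACS is correct if (given maps $\rho : \mathit{STATE}_{ACS} \to [S \to T]$
and $\psi : \mathit{STATE}_{ACS} \to \mathit{STATE}_{PM}$ defined below) the following
diagram commutes for all clock cycles $s = \mathit{start}(\lambda(\mathit{state}))(s)$ and
$\mathit{state} \in \mathit{STATE}_{ACS}$.

[Commutative diagram: the top row is $\mathit{State}_{PM} : T \times \mathit{STATE}_{PM} \to \mathit{STATE}_{PM}$,
the bottom row is $\mathit{State}_{ACS} : S \times \mathit{STATE}_{ACS} \to \mathit{STATE}_{ACS}$, and the
vertical maps lead from the ACS level to the PM level.]

The map $\psi : \mathit{STATE}_{ACS} \to \mathit{STATE}_{PM}$ is defined below.

\[
\psi(\mathit{state}) = (\mathit{Icache}, \mathit{Dcache}, \mathit{pc}, \mathit{reg}),
\]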



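where Icache, Dcache, pc and reg are those parts of state also present
in $\mathit{STATE}_{PM}$. Let R be a retirement clock; each cycle of clock R
corresponds to the committal of a number of instructions. The map
$\rho : \mathit{STATE}_{ACS} \to [S \to T]$ is defined by $\rho(b) = \bar{\lambda}_1(b)\,\lambda_2(b)$, with the retimings
$\lambda_1 : \mathit{STATE}_{PM} \to \mathit{Ret}(T, R)$ and $\lambda_2 : \mathit{STATE}_{ACS} \to \mathit{Ret}(S, R)$ defined
by their respective immersions below.

\[
\begin{array}{l}
\bar{\lambda}_1(\mathit{state})(0) = 0, \\
\bar{\lambda}_2(\mathit{state})(0) = 0, \\
\bar{\lambda}_1(\mathit{state})(r + 1) = \mathit{dur}_1(\mathit{State}_{ACS}(\bar{\lambda}_2(\mathit{state})(r), \mathit{state})) + \bar{\lambda}_1(\mathit{state})(r), \\
\bar{\lambda}_2(\mathit{state})(r + 1) = \mathit{dur}_2(\mathit{State}_{ACS}(\bar{\lambda}_2(\mathit{state})(r), \mathit{state})) + \bar{\lambda}_2(\mathit{state})(r).
\end{array}
\]

The duration functions $\mathit{dur}_1 : \mathit{STATE}_{ACS} \to T^+$ and $\mathit{dur}_2 : \mathit{STATE}_{ACS} \to S^+$,
where $T^+ = \{t \in T \mid t > 0\}$ and $S^+ = \{s \in S \mid s > 0\}$, are defined below.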

\[
\mathit{dur}_1(\mathit{state}) = \mathit{Committed}(\mathit{next}_{ACS}^{\,\mathit{dur}_2(\mathit{state})}(\mathit{state})),
\]

\[
\mathit{dur}_2(\mathit{state}) = \text{least } s \in S^+ \text{ such that } \mathit{CanCmt}(\mathit{next}_{ACS}^{\,s}(\mathit{state})).
\]

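Duration function $\mathit{dur}_1$ counts the number of instructions committed for
each cycle of clock R. Duration function $\mathit{dur}_2$ counts the number of system
clock cycles for each cycle of event clock R.

The function $\mathit{Committed} : \mathit{STATE}_{ACS} \to T^+$ gives the number of
instructions committed from a given state.

\[
\mathit{Committed}(\mathit{state}) =
\left\{
\begin{array}{ll}
0, & \text{if not } \mathit{CanCmt}(\mathit{reorderbuf}); \\
1 + \text{greatest } n \in \mathbb{N} \text{ such that } \forall m \le n,\ \mathit{CanCmt}(\mathit{Pop}^m(\mathit{reorderbuf})), & \text{otherwise.}
\end{array}
\right.
\]

The function $\mathit{CanCmt} : \mathit{ReorderBuf} \to \mathbb{B}$ is defined below.

\[
\mathit{CanCmt}(\mathit{reorderbuf}) =
\left\{
\begin{array}{ll}
\mathit{tt}, & \text{if not } \mathit{Empty}(\mathit{reorderbuf}) \text{ and } \pi^{\mathit{reorderentry}}_{\mathit{type}}(\mathit{Top}(\mathit{reorderbuf})) \ne \mathit{wait}; \\
\mathit{ff}, & \text{otherwise.}
\end{array}
\right.
\]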
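
The two functions can be sketched in a few lines of Python. The sketch assumes, for illustration only, that a reorder buffer is a list of entry types (strings) ordered from the top, so that Pop applied m times corresponds to the slice starting at index m.

```python
def can_cmt(reorderbuf):
    """True if the buffer is non-empty and its top entry is not waiting."""
    return bool(reorderbuf) and reorderbuf[0] != "wait"


def committed(reorderbuf):
    """Number of consecutive committable entries counted from the top.

    Follows the definition: 0 if the top cannot commit, otherwise
    1 + the greatest n such that can_cmt holds after m pops, for all m <= n.
    """
    if not can_cmt(reorderbuf):
        return 0
    n = 0
    while can_cmt(reorderbuf[n + 1:]):
        n += 1
    return 1 + n
```

The loop terminates because the slices eventually become empty, and an empty buffer cannot commit.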

9.2 Verifying ACS 

Space precludes a full formal discussion of the normal process of verifying 
the correctness of a microprocessor, expressed as an iterated map in the 
form above. However, informally, the argument proceeds as follows.
Microprocessors expressed as iterated maps are functions only of their initial
state, some number of clock cycles, and (possibly) some inputs. They do
not depend on the numeric value of time. That is, given an initial state $\sigma_0$,
and neglecting inputs, suppose we run a microprocessor representation F
for $t_1 + t_2$ clock cycles, and finish (at time $t = t_1 + t_2 - 1$) in state $\sigma_n$. We
would reach the same state $\sigma_n$ if we first ran F for $t_1$ cycles, reaching
state $\sigma_i$, reset time to zero, and then ran F for $t_2$ cycles, now starting in
state $\sigma_i$.
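
This time-invariance property is easy to observe on any iterated map. In the Python sketch below, run plays the role of F without inputs, and toy_next is a hypothetical next-state function (a two-bit program counter and an accumulator) chosen only for illustration; it is not a model of any processor in this paper.

```python
def run(next_state, t, state):
    """Iterated map: apply next_state to state t times."""
    for _ in range(t):
        state = next_state(state)
    return state


def toy_next(state):
    """Hypothetical machine: 2-bit program counter and an accumulator."""
    pc, acc = state
    return ((pc + 1) % 4, acc + pc)


def splits_agree(next_state, t1, t2, state):
    """Running t1 + t2 cycles equals running t1 cycles, then t2 more."""
    intermediate = run(next_state, t1, state)
    return run(next_state, t1 + t2, state) == run(next_state, t2, intermediate)
```

Because run depends only on the number of iterations and the starting state, splits_agree holds for every choice of t1, t2 and state.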

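Logically extending this argument, given $\mathit{State}_{PM} : T \times \mathit{STATE}_{PM} \to \mathit{STATE}_{PM}$
and $\mathit{State}_{AC} : S \times \mathit{STATE}_{ACS} \to \mathit{STATE}_{ACS}$, together
with retiming $\lambda : \mathit{STATE}_{ACS} \to \mathit{Ret}(S, T)$ and projection function
$\psi : \mathit{STATE}_{ACS} \to \mathit{STATE}_{PM}$, to show that $\mathit{State}_{AC}$ is a correct
implementation of $\mathit{State}_{PM}$ it is sufficient to establish the following for all
$\mathit{state} \in \mathit{STATE}_{ACS}$.

1. $\mathit{State}_{PM}(0, \psi(\mathit{state})) = \psi(\mathit{State}_{AC}(0, \mathit{state}))$.

2. $\mathit{State}_{PM}(1, \psi(\mathit{state})) = \psi(\mathit{State}_{AC}(\bar{\lambda}(\mathit{state})(1), \mathit{state}))$.

3. At time $\bar{\lambda}(\mathit{state})(1)$, the state of $\mathit{State}_{AC}$ is consistent with correct
future execution. The easiest way to establish this is to use $\mathit{init}_{AC}$ as
an invariant, and show that

\[
\mathit{State}_{AC}(\bar{\lambda}(\mathit{state})(1), \mathit{state}) = \mathit{init}_{AC}(\mathit{State}_{AC}(\bar{\lambda}(\mathit{state})(1), \mathit{state})).
\]

This requires that $\mathit{init}_{AC}$ is as weak as possible: that is, $\mathit{init}_{AC}(\mathit{state})$
will leave elements of $\mathit{state} \in \mathit{STATE}_{ACS}$ unchanged unless they are
inconsistent (e.g. $m(\mathit{pc}) \ne \mathit{ir}$: see Sect. 2).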

This can significantly simplify formal verification. A more formal
discussion can be found in [FH98], and a full account in [Fox98] (including the
conditions $\mathit{STATE}_{PM}$, $\mathit{STATE}_{ACS}$, $\lambda$ and $\psi$ must satisfy). The same
simplification has also been observed, within the framework of their own
formalisms, by others working on microprocessor verification; for
example, [WC94, MS95b, MS95a, WB96, Bur96, SDB96, Cyr96].
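
To make the three one-step conditions concrete, the following Python sketch checks them exhaustively on a deliberately tiny example. Everything in it is a hypothetical toy and not the ACS model: the specification is a modulo-8 counter that increments once per cycle of T, the implementation needs two cycles of S per increment (tracked by a phase bit), psi forgets the phase, the immersion at 1 is the number of S cycles until the next increment completes, and init_ac clears the phase.

```python
N = 8  # size of the toy counter (assumption made for the example)


def next_pm(count):
    """Specification: one increment per cycle of clock T."""
    return (count + 1) % N


def next_ac(state):
    """Implementation: two cycles of clock S per increment."""
    count, phase = state
    return (count, 1) if phase == 0 else ((count + 1) % N, 0)


def run(next_fn, t, state):
    """Iterated map: apply next_fn to state t times."""
    for _ in range(t):
        state = next_fn(state)
    return state


def psi(state):
    """Projection from implementation states to specification states."""
    return state[0]


def immersion_at_1(state):
    """S cycles needed from this state to complete one increment."""
    return 2 - state[1]


def init_ac(state):
    """Initialisation: keep the count, clear the phase bit."""
    return (state[0], 0)


def one_step_conditions_hold(immersion):
    """Check conditions 1-3 for every implementation state."""
    for state in [(c, p) for c in range(N) for p in (0, 1)]:
        after = run(next_ac, immersion(state), state)
        if run(next_pm, 0, psi(state)) != psi(run(next_ac, 0, state)):
            return False  # condition 1 fails
        if run(next_pm, 1, psi(state)) != psi(after):
            return False  # condition 2 fails
        if after != init_ac(after):
            return False  # condition 3 fails
    return True
```

With immersion_at_1 all sixteen states pass. Replacing it by a constant two cycles breaks condition 3 for states that start with the phase bit set, since they end in the middle of an increment.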

There are several difficulties in the case of superscalar microproces- 
sors. 

1. The size of the state-space makes establishing that

\[
\mathit{State}_{PM}(1, \psi(\mathit{state})) = \psi(\mathit{State}_{AC}(\bar{\lambda}(\mathit{state})(1), \mathit{state}))
\]

difficult, simply because of the number of cases to consider. A large
proportion of the possible cases will be disallowed by $\mathit{init}_{AC}$; but even
so, the number remaining is very large [Fox98].

2. The complexity of the relationships within the state-space makes it
difficult to construct an appropriately weak initialisation function.
The current initialisation function (Sect. 8.2) for ACS simply resets
the pipeline, and is very strong. However, given the size of the
state-space, the complexity of the relationships between state components,
and the consequent number of possible, consistent values for each of
the state elements, a weak initialisation function is extremely complex
[Fox98].

3. As well as being complex to construct and check, such an initialisation
function will consume considerable resources in automated verification
attempts because of the need to check

\[
\mathit{State}_{AC}(\bar{\lambda}(\mathit{state})(1), \mathit{state}) = \mathit{init}_{AC}(\mathit{State}_{AC}(\bar{\lambda}(\mathit{state})(1), \mathit{state})).
\]

It is not clear how to address these problems. In [Fox98] a systematic
method for constructing initialisation functions for pipelined processors
is described, where the state of the pipeline can be uniquely determined
by the immediately preceding instructions. This will not work for ACS,
where the state of the pipeline may be influenced by instructions that
passed through in the (potentially distant) past. (For example, the choice
of which addition unit to send an add instruction to will be influenced by,
among other things, the sizes of the queues at each unit, which in turn
will be influenced by the number and distribution of add instructions
in the past.) The problems essentially stem from the ‘complexity’ of the
state-space, where ‘complexity’ in this context is some measure of (i) the
number of separate state components; and (ii) the relationships between
the state components. Point (i) leads to a large number of cases to be
checked. Point (ii) makes establishing that the processor is in a legal
state, corresponding to one of the cases, complex and time consuming.

We are considering two possible approaches to reducing state-space
complexity. The first is to make concessions in the implementation that
reduce the complexity of the state-space. This could obviously affect
performance adversely and, at first sight, seems to imply a return to less
advanced implementations. However, this need not be the case. The aim
is more subtle than simply making the state-space ‘smaller’: recall that a
monolithic memory can be very large, yet is conceptually simple. It may
be possible to reduce the number of discrete state components, and the
complexity of their interrelationships, while still maintaining the potential
for high instruction throughput. The second approach involves inserting
a new level of abstraction between the current PM and AC levels. By
doing this, it may be possible to conceal some of the complexities of the
current AC level and hence simplify the representation of the processor.
It would, of course, still be necessary to verify that the AC level is correct
with respect to this new level of abstraction. However, we could consider
each of the physical units of the processor in isolation, which would make
verification more tractable.



10 Concluding Remarks 

We have shown that the algebraic tools developed for representing simple, 
non-superscalar microprocessor implementations are equally applicable to 
complex superscalar examples. The algebraic techniques are not specific 
to any particular software, but are adaptable to a range of currently- 
available tools. We have developed, in considerable detail, a superscalar 
implementation. We have formulated the correctness conditions for the 
implementation with respect to an architecture. We have also briefly dis- 
cussed the problems of formal verification. Future work will consider sim- 
plifying formal verification, by studying possible alternative techniques for 
reducing complexity. In addition, we intend to consider levels of
abstraction higher than the current PM level. [Ste96, Ste98] consider algebraic
models of high-level languages, compilers and abstract machine languages
in a form very similar to our models of hardware. We wish to bridge the
gap between abstract machine languages and the current PM level, in
order to construct a unified algebraic model of computer systems, from
high-level languages to abstract hardware.

Abstract. First, we study the general idea of a spatially extended
system (SES) and argue that many mathematical models of systems
in computing and natural science are examples of SESs. We examine
the computability and the equational definability of SESs and
show that, in the discrete case, there is a natural sense in which an
SES is computable if, and only if, it is definable by equations. We
look at a simple idea of hierarchical structure for SESs and, using
respacings and retimings, we define how one SES abstracts,
approximates, or is implemented by another SES. Secondly, we study
a special kind of SES called a synchronous concurrent algorithm
(SCA). We define the simplest kind of SCA with a global clock
and unit delay which are computable and equationally definable by
primitive recursive equations over time. We focus on two examples
of SCAs: a systolic array for convolution and a non-linear model
of cardiac tissue. We investigate the hierarchical structure of SCAs
by applying the earlier general concepts for the hierarchical
structure of SESs. We apply the resulting SCA hierarchy to the formal
analysis of both the implementation of a systolic array and the
approximation of a biologically detailed model of cardiac tissue.



1 Introduction 

Modern computing hardware comprises a great range of devices, includ- 
ing digital and analogue components, which are studied at a number of 
levels of abstraction, determined by the physical technologies, circuit lay- 
outs and the architectures seen by programmers. A comprehensive theory 
of hardware design should encompass both this diversity and hierarchy 
of levels of abstraction. It should accommodate changes caused by the 
emergence of new physical technologies and programming techniques. A 
comprehensive theory of hardware design might also integrate the digital
and analogue components of systems, bridging computers and
electro-mechanical devices, from signal processing systems to biological tissue.

We are far from having a comprehensive theory of hardware. The de- 
sign of transistors and circuits requires many mathematical models aimed 
at different levels of abstraction. Architectures have only recently become 
accessible to mathematical modelling, and we are struggling to support 
their design and programming with formal techniques. Technology in- 
dependence and software-hardware codesign are popular research areas. 
The theoretical unification and integration of digital systems with ana- 
logue environments is a challenge. 

In contemplating and seeking a theory of hardware, we are led to some 
significant scientific problems, not least among which is the following: 

Integrative hierarchy problem. Develop a mathematical theory 

that is able to relate and integrate different mathematical models 

at different levels of abstraction. 

In the case of conventional hardware this involves relating and combining 
methods for the modelling of physical systems which are normally contin- 
uous, with the methods for modelling architectures which are normally 
discrete. 

It turns out that this integrative hierarchy problem is not only a prob- 
lem for hardware: it is a problem for the scientific modelling of physical 
systems in general. The levels of hierarchy are often determined by spe- 
cific scientific or engineering ideas, which vary with the problem area 
and its state of the art. In some areas, such as the non-linear dynami- 
cal properties of physiological excitable media (e.g., neural and muscular 
tissue), the hierarchical nature is evident in the science of neurones and 
cardiac cells. It is particularly clear in the case of hardware because of 
the pre-eminent role of the hierarchy. 

All types of systems, from microprocessors to neural or muscular tis- 
sue, are modelled at different levels of abstraction for different purposes. 
Inevitably we need to compare models. In microprocessor design we often 
see this comparison as a correctness problem. In the case of, say, models 
of cardiac tissue, the existence of two partial differential equations that 
model a system at different levels of abstraction leads to two algorithms 
that ought to be related. In scientific modelling we often see this compar- 
ison as an approximation problem. 

In understanding a complex system, a portfolio of models is needed, 
representing the system at different levels of abstraction. The family of 
models might range from abstract qualitative models of the system that
are computationally efficient to concrete quantitative models that are
computationally intensive. They might also offer different views of the 
system relevant to different users. 

In this paper we will describe an approach to the integrative hierarchy 
problem in general. It is limited by the assumptions that 

all the mathematical models in the hierarchy are algorithmic mod- 
els of spatially extended systems, and 

all the algorithmic models of spatially extended systems are syn- 
chronous concurrent algorithms. 

We begin, in Section 2, by studying the concept of a spatially extended 
system (SES). We give some simple mathematical definitions that en- 
compass examples of continuous physical models and discrete computing 
systems. The gap between the continuous and discrete models can be 
bridged by appropriate abstract concepts, such as spatially extended sys- 
tem. 

In Section 3 we discuss the computable representations of SESs and 
their mathematical characterisation by means of equations. Using the 
algebraic theory of data, we show that 

a countable spatially extended system can be computably modelled 
if, and only if, it can be uniquely characterised by a small finite 
set of equations. 

In Section 4 we formulate notions of hierarchy for SESs by giving simple 
refinements of space, time, data, and behaviour. 

Primarily, the aim of this paper is to study the hierarchical structure 
in a special class of algorithmic models of SESs called synchronous concur- 
rent algorithms (SCAs), and to show the applicability of the hierarchical 
SCA framework to both computing and physical systems. 

A synchronous concurrent algorithm (SCA) is a network constructed 
from modules, channels, data and clocks. The modules are distributed in 
space and are connected by channels. The modules process data and the 
data moves around the network, from module to module, via the channels. 
The modules and channels operate simultaneously and are governed by 
one or more clocks. Thus the network is an algorithm that is synchronous 
and concurrent. “Synchronous” means “with time”. The algorithm can 
operate continually over periods of time and can process sequences or 
streams of input data. A formal definition of a simple kind of SCA is 
given in Section 5. 
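
Before the formal definition, the informal description can be illustrated by a small simulator. The Python sketch below is a sketch under stated assumptions and is not the definition of Section 5: it assumes a single global clock, unit-delay channels, one output value per module, and a single external input stream. The names run_sca, modules, sources and stream are hypothetical.

```python
def run_sca(modules, sources, initial, stream, t):
    """Simulate t cycles of a network of modules under a global clock.

    modules -- list of functions, one per module
    sources -- sources[i] lists what module i reads: "in" for the external
               input stream, or an integer j for the output of module j
    initial -- initial output value of each module
    stream  -- function from cycle number to external input value
    """
    values = list(initial)
    for cycle in range(t):
        def read(src, cycle=cycle, old=values):
            # Every module reads the values from the previous cycle.
            return stream(cycle) if src == "in" else old[src]
        # All modules fire simultaneously: the new list is built from the
        # old values before it replaces them.
        values = [f(*(read(s) for s in srcs))
                  for f, srcs in zip(modules, sources)]
    return values
```

As a usage example, a two-module network in which module 0 latches the input and module 1 adds module 0's previous output to its own previous output acts as a delayed accumulator of the input stream.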

Algorithms that match this loose description of an SCA can be found
in abundance. In computer science, SCAs are a fundamental structure
found in the design of digital hardware, including:

— general purpose hardware (e.g., microprocessors); 

— special purpose hardware (e.g., graphics machines, systolic arrays); 

— parallel computer architectures (e.g., hypercubes); and 

— general parallel models of computation (e.g., PRAMs). 

Information on SCA-related models of microprocessors can be found in
[HT96]. SCA models of systolic algorithms and graphics machines are
discussed in [HTT88] and [EST90], respectively. SCAs can also be found
in other parts of computer science, although sometimes less obviously. A
general introduction to models of parallel computation is [Sav98].

In mathematics and the natural and engineering sciences, SCAs arise 
in algorithmically approximating the solutions of: 

— coupled ordinary differential equations (CODEs); and 

— partial differential equations (PDEs). 

Approximation techniques, like the finite difference method (FDM) and 
finite element method (FEM), are based on spatial intuitions and can 
lead to algorithms that qualify as SCAs. In addition to approximation 
algorithms for differential equations, a number of discrete and more al- 
gorithmic approaches to modelling systems have been created, perhaps 
most notably: 

— neural networks; 

— cellular automata; and 

— coupled map lattices. 

These subjects are surveyed in [RM86], [Wol86] and [Kan92], respectively.
In Section 5 we give examples of SCAs from hardware and physiological 
excitable media. 

These SCA methods of modelling physical systems have several fea- 
tures in common. They start from a view of space and time that is dis- 
crete. They focus on the local behaviour of a point or small area in space 
and postulate rules for the interactions at that site. The algorithms syn- 
thesise global behaviour from the local behaviour. The parallelism is the 
result of calculating local behaviour at all sites in the space simultane- 
ously. 

In Section 6 we address the question that lies at the heart of the paper: 





M.J. Poole, A.V. Holden, and J.V. Tucker 



How does one SCA abstract, approximate, or implement another
SCA?

Using the principles given for spatially extended systems in Section 4, 
we will analyse the relationship between two SCAs at different levels of 
spatial, temporal and data abstraction. 

We then apply our analysis to the formal description of the relation- 
ship existing between SCAs in hardware and excitable media. We treat 
the cases of systolic array architectures for convolution, and models of 
electrical behaviour in cardiac tissue. For both case studies, we compare 
models at two levels of abstraction. 

The pre-requisites for this paper are the elements of the algebraic 
theory of data and some interest in non-linear systems. References will 
be given in the appropriate places. 

2 Spatially Extended Systems 

An aim of our theory is to show that a variety of deterministic systems in 
computing, mathematics, natural science and engineering have a common 
algorithmic structure. First, here and in Section 3, we demonstrate that 
the conceptions of deterministic systems found in these disparate areas 
have a common algebraic form that can be defined by equations. The 
algorithmic models arise from these algebraic models. 

2.1 Spatially Extended Deterministic Systems 

To model a system, we must choose properties of the system and char- 
acterise its behaviour over time in terms of these properties. The chosen 
properties are used to define a notion of state for the system, and a set 
of all possible states of the system. 

Let X be a set of points, or sites, where the properties of the system
are measured. The set X provides names for locations in some geometric
space. Let A be a set of data used to measure the properties of the system
at each location in X. Then a state of the system is a function $s : X \to A$
such that for $x \in X$,

s(x) = data characterising the system’s state at point x.

The set of all possible states of the system is, therefore, a subset $S(X, A)$
of the set $[X \to A]$ of all functions with domain X and codomain A. In
many cases $S(X, A) = [X \to A]$. In other cases, such as the vibrating
string in Section 2.3, $S(X, A) \subset [X \to A]$. For simplicity, we have chosen
the single-sorted case in which the set A is used to describe possible states
of all points; it is not difficult to extend the approach to the many-sorted
case.

The set T of times at which we wish to know the behaviour of the
system is defined by a clock that may, in general, be discrete or
continuous. The key requirement is that time is modelled by an infinite, linearly
ordered set with an initial cycle or instant. For discrete time we take
$T = \mathbb{N}$, the natural numbers, and for continuous time we take $T = \mathbb{R}^+$,
the non-negative real numbers.

In addition, there may be a set P of parameters that are inputs to the
system and affect its behaviour at each time instant. The parameters may
change over time and so a stream $\ldots, p_t, \ldots$ of parameters, with $p_t \in P$
for each time instant $t \in T$, is involved in the operation of the system. A
stream of parameters is a map $p : T \to P$ such that, for all $t \in T$,

p(t) = parameter value at time t.

The set of all possible parameter streams for the system is, in general, a
subset $S(T, P)$ of $[T \to P]$. Often, in the case $T = \mathbb{R}^+$, the stream p is
called a signal.

The fact that the system is deterministic means that 

at each time point the system can be in one and only one state, 
and this state is uniquely determined by the time point, the initial 
state of the system, and a stream of parameters for the system. 

In fact, the state depends only on the initial segment of the parameter 
stream defined by the current time point, rather than on the entire stream. 
These ideas are expressed mathematically as follows. 

Definition: spatially extended system. A spatially extended
deterministic system model S consists of:

— a non-empty set X of points distributed in space;

— a non-empty set A of data;

— a set $S(X, A) \subseteq [X \to A]$ of global states;

— a non-empty set T of time points;

— a non-empty set P of input parameters;

— a non-empty set $S(T, P) \subseteq [T \to P]$ of input parameter streams; and

— a system function

\[
F : T \times S(T, P) \times S(X, A) \to S(X, A)
\]

defined for all $t \in T$, $p \in S(T, P)$ and $s \in S(X, A)$, by

F(t, p, s) = state of the system at time t evolving under input
parameter stream p from initial state s.

The system state at time t is determined by an initial segment of the
input stream: for all $p_1, p_2 \in S(T, P)$ where $p_1(t') = p_2(t')$ for all $t' < t$,
$F(t, p_1, s) = F(t, p_2, s)$.

The spatially extended deterministic system model is a 6-sorted
algebra

\[
S = (X, A, T, P, S(T, P), S(X, A) \mid F).
\]
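
A discrete instance of this definition can be written down directly. The Python sketch below is one assumed example rather than a canonical one: time is the natural numbers, the space X is a ring of n sites, states are lists of data values, and the system function is obtained by iterating a local rule that reads a site, its two neighbours and the current parameter value. The names system_function and local_rule are hypothetical.

```python
def system_function(t, p, s, local_rule):
    """F(t, p, s) for discrete time on a ring of len(s) sites.

    t          -- number of time steps (a natural number)
    p          -- parameter stream, a function from time to a parameter
    s          -- initial global state, a list with one value per site
    local_rule -- function (left, centre, right, parameter) -> new value
    """
    state = list(s)
    n = len(state)
    for u in range(t):
        # Only p(0), ..., p(t - 1) are consulted, so the state at time t
        # depends on the initial segment of the stream before t.
        state = [local_rule(state[(x - 1) % n], state[x],
                            state[(x + 1) % n], p(u))
                 for x in range(n)]
    return state
```

Two streams that agree at all times before t therefore give the same state at time t, which is the condition stated in the definition. A closed system is the special case in which local_rule ignores its parameter argument.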



Definition: closed spatially extended system. A closed spatially
extended deterministic system model S is a spatially extended system
without input parameter streams. A closed system has space set X, data
set A, global state set $S(X, A)$, time set T and system function
$F : T \times S(X, A) \to S(X, A)$. A closed spatially extended system model is a
4-sorted algebra

\[
S = (X, A, T, S(X, A) \mid F).
\]

Examples of SESs are given shortly, in Section 2.3. 



Generalisation to multi-valued local states. We will call a local 
state the data that characterise the state of a system at any point x. 
In practice, a local state will often have many components, and these 
components may be of different units (and different sorts) across the 
system. Thus, variable names and identifiers will be used in states. Here 
we extend the basic model to allow each point to have a finite number of 
named properties or measurements of the same sort. Extending this idea 
to the many-sorted case is straightforward. 

Let $Var$ be a set of names that may be used to identify the components of local states. Let $var : X \to Powerset_f(Var) - \{\emptyset\}$ be a map that assigns to each point $x \in X$ a non-empty finite set $var(x) \subseteq Var$ of names that identify each of the state-components of point $x$. A local state at point $x$ is described by a mapping of the form $s_x \in A^{var(x)}$ where, for $v \in var(x)$, $s_x(v) \in A$ is the $v$-component of the local state at $x$. The set of all possible local states at $x$ is $A^{var(x)}$. Let 

$$S(X,A) \subseteq \{ s \in [X \to \bigcup_{x \in X} A^{var(x)}] \mid s(x) \in A^{var(x)} \text{ for all } x \in X \}$$

denote the set of all possible global states of the system. For $s \in S(X,A)$, $s(x) \in A^{var(x)}$ is the local state at point $x$ and $s(x)(v) \in A$ is the $v$-component of this local state. 

This simple naming of state components allows us to describe systems with vector-valued local states. Let $\mathbb{N} \subseteq Var$ and let $var(x) = \{1, \ldots, n(x)\}$ for each $x \in X$, where $n : X \to \mathbb{N}$ determines the length of local state vectors at each point throughout the system (we require that $n(x) > 0$ for all $x \in X$). A local state of each point $x \in X$ is a vector of the form $s_x \in A^{n(x)}$ where $s_x(i) \in A$ (for $1 \le i \le n(x)$) is the $i$-th element of the local state of $x$. 

It is useful to be able to identify a state component across the space $X$. Let $C \subseteq X \times Var$ be a set of global state coordinates defined by 

$$C = \{(x,v) \mid x \in X \text{ and } v \in var(x)\}$$

where coordinate $(x,v)$ refers to the $v$-component of the local state at point $x$. Let 

$$S(X,A) \subseteq [C \to A]$$

denote the set of all possible global states of the system, where for $s \in S(X,A)$, $s(x,v) \in A$ is the $v$-component of the local state at point $x$. 
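As a minimal sketch of the coordinate construction, the following builds the set $C$ from a naming map and converts a nested state $s(x)(v)$ into a flat state $s(x,v)$. The points `x1`, `x2` and the names `d`, `v` are hypothetical choices made only for illustration.

```python
# Hypothetical naming map var : X -> non-empty finite sets of names.
var = {"x1": {"d", "v"}, "x2": {"d"}}

# Global state coordinates C = {(x, v) | x in X and v in var(x)}.
C = {(x, v) for x, names in var.items() for v in names}


def to_coordinates(s):
    """Turn a nested state s(x)(v) into a flat state s(x, v) defined on C."""
    return {(x, v): s[x][v] for (x, v) in C}


nested = {"x1": {"d": 0.5, "v": -1.0}, "x2": {"d": 2.0}}
flat = to_coordinates(nested)
print(flat[("x1", "v")])
```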

Observable and hidden states. Another generalisation to the basic 
SES model is to distinguish between those points of a system whose states 
are external or observable, and those that are internal or hidden. 

Let $X_{obs} \subseteq X$ be the set of points in the system whose states are observable, and let $X_{hid} = X - X_{obs}$ be the set of remaining points, whose states are hidden. The set $S_{obs}(X,A) \subseteq [X_{obs} \to A]$ of all possible global observable states of the system is defined by 

$$S_{obs}(X,A) = \{ s_{obs} \in [X_{obs} \to A] \mid s_{obs} = s|_{X_{obs}} \text{ for some } s \in S(X,A) \}.$$

An observable state function 

$$F_{obs} : T \times S(T,P) \times S(X,A) \to S_{obs}(X,A)$$

is easily defined from the system function $F$, for all $t \in T$, $p \in S(T,P)$ and $s \in S(X,A)$, by 

$$F_{obs}(t,p,s) = F(t,p,s)|_{X_{obs}}.$$

For a system with multi-valued local states, we can define a set $C_{obs} \subseteq C$ to be those state coordinates whose values are observable, and let $C_{hid} = C - C_{obs}$ be those remaining coordinates whose states are hidden. The set $S_{obs}(X,A)$ of global observable states, and the global observable state function $F_{obs}$, are defined in a straightforward manner. 
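The restriction that defines the observable state function can be sketched as follows. The toy system function `F` and the point names `out` and `tmp` are hypothetical; only the restriction step reflects the definition in the text.

```python
def restrict(s, X_obs):
    """Restriction of a global state s to the observable points X_obs."""
    return {x: s[x] for x in X_obs}


def make_F_obs(F, X_obs):
    """Build F_obs(t, p, s) = F(t, p, s) restricted to X_obs."""
    def F_obs(t, p, s):
        return restrict(F(t, p, s), X_obs)
    return F_obs


def F(t, p, s):
    """Hypothetical system: every point accumulates the inputs read so far."""
    total = sum(p(k) for k in range(t))
    return {x: a + total for x, a in s.items()}


F_obs = make_F_obs(F, {"out"})
print(F_obs(3, lambda k: 2, {"out": 0, "tmp": 10}))
```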
2.2 A Classification of Spatially Extended Systems 

We illustrate these general definitions by noting some common terminol- 
ogy for their components in computing and physical systems: 



Elements of System   Computing System                      Physical System
------------------   -----------------------------------   ---------------------------------
X                    locations, variables, identifiers,    points, sites, locations, cells,
                     names, registers                      nodes
A                    data                                  data
T                    discrete time points, instants,       continuous and discrete time
                     cycles, intervals                     points, instants
P                    input data, instructions,             physical parameters, boundary
                     parameters, events, messages,         conditions, perturbations,
                     button presses                        tolerances


In continuous space systems, usually $X$ is a geometric object, such as a compact smooth manifold, which can be embedded as a subspace of $\mathbb{R}^n$. 

In discrete space systems, $X$ can be a discrete subset of points from a geometric object (for example, a subset of the integer lattice $\mathbb{Z}^n \subseteq \mathbb{R}^n$ or a finite element mesh). 

The components $X$, $T$ and $A$ (and $P$) are used to classify models of physical systems in terms of space, time and state (and parameters). For example, the primary characteristic of a component $X$, $T$ or $A$ (or $P$) is whether it is discrete or continuous. There are 8 (or 16) cases; those of interest to us are listed in the table below, where “D” denotes “discrete” and “C” denotes “continuous”: 



Space   Time   State   Parameters   Example
-----   ----   -----   ----------   --------------------
C       C      C       C            Solutions to PDEs
D       C      C       C            Solutions to CODEs
D       D      C       C            Coupled Map Lattices
D       D      D       D            Cellular Automata
2.3 Examples 

Consider briefly some simple examples of spatially extended systems from 
hardware and mechanics to illustrate the definitions. 

Example (Computer). A computer is a spatially extended discrete system. Consider a simple $l$-bit machine, illustrated in Figure 1. 

[Figure 1: a row of cells labelled $pc, r_1, r_2, r_3, \ldots, r_k, m_1, m_2, m_3, \ldots, m_n$, holding the values $s(pc), s(r_1), \ldots, s(r_k), s(m_1), \ldots, s(m_n)$.] 

Fig. 1. A simple computer. 



The space $X$ is the set 

$$I = \{pc, r_1, r_2, \ldots, r_k, m_1, m_2, \ldots, m_n\}$$

of names for the program counter $pc$, registers $r_1, r_2, \ldots, r_k$ and memory locations $m_1, m_2, \ldots, m_n$. The computer is characterised by the data at these locations. Let $W_l = \{0,1\}^l$ be the set of $l$-bit words. A state of the computer is a function $s : I \to W_l$ such that for $i \in I$, $s(i)$ = $l$-bit word stored at location $i$. Let $[I \to W_l]$ be the set of all states. Time $T$ is discrete and is represented by the set $\mathbb{N}$ of natural numbers. Furthermore, let $P$ be the set of input data; the computer reads a datum at each clock cycle. The operation of the computer in time is specified by $F : T \times [T \to P] \times [I \to W_l] \to [I \to W_l]$ where, for $t \in T$, $p \in [T \to P]$, $s \in [I \to W_l]$ and $i \in I$, 

$F(t,p,s)(i)$ = $l$-bit word stored in the computer at location $i$ at time $t$ on starting the machine with input stream $p$ from initial state $s$. 

The system function is specified by a system of equations. 

Example (Vibrating String). An elastic vibrating string, fixed at either end, is a closed spatially extended continuous system; see Figure 2. The space $X$ is the interval $[a,b]$ which represents the length of the string at rest, and the string is characterised by its displacement from the horizontal and its velocity. Let $var(x) = \{d, v\}$ for all $x \in [a,b]$ so that values of the state coordinates $(x,d)$ and $(x,v)$ give the displacement and velocity of the string at location $x$. Let the set $S([a,b], \mathbb{R})$ of all possible states be the subset of $[[a,b] \times \{d,v\} \to \mathbb{R}]$ for which displacements and velocities are continuously differentiable across $[a,b]$ and have zero values at $a$ and $b$. Time is continuous and is represented by the set $\mathbb{R}^+$ of non-negative real numbers. The system function is of the form $F : \mathbb{R}^+ \times S([a,b], \mathbb{R}) \to S([a,b], \mathbb{R})$ where, for $t \in \mathbb{R}^+$ and $s \in S([a,b], \mathbb{R})$, 

$F(t,s)$ = displacement and velocity of the string at time $t$ from the initial string state $s$. 

The system function $F$ is specified by the one-dimensional wave equation 

$$\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}$$

(where $y$, $t$ and $x$ are displacement, time and space variables respectively, and $c$ is a constant derived from the string’s mass and tension) in the following way: if $y$ is a solution to the wave equation with initial string position $y(x,0) = s(x,d)$ for $x \in [a,b]$, and initial string velocity 

$$\frac{\partial y}{\partial t}(x,0) = s(x,v),$$

for $x \in [a,b]$, then for all $t \in \mathbb{R}^+$ and $x \in [a,b]$, $F(t,s)(x,d) = y(x,t)$. 
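Since the string has continuous states, in practice one computes an approximation to $F$. The following is a minimal sketch of one such approximation, using a standard leapfrog finite-difference scheme; the scheme, the grid and the function name `string_displacement` are assumptions introduced here and are not the method of the text. The time step is fixed at $dt = dx/c$, so the sketch returns displacements at times $t = \text{steps} \cdot dx / c$ only.

```python
import math


def string_displacement(d0, v0, c, dx, steps):
    """Leapfrog approximation of F(t, s)(x, d) at t = steps * dx / c.

    d0, v0: initial displacement and velocity sampled on a uniform grid
    whose first and last points are the fixed ends a and b.
    """
    n = len(d0)
    dt = dx / c                          # Courant number c * dt / dx = 1
    prev = list(d0)
    if steps == 0:
        return prev
    # First step uses the initial velocity (second-order Taylor expansion in t).
    cur = [0.0] * n
    for j in range(1, n - 1):
        cur[j] = 0.5 * (prev[j + 1] + prev[j - 1]) + dt * v0[j]
    # Later steps use the discretised wave equation; the ends stay at zero.
    for _ in range(steps - 1):
        nxt = [0.0] * n
        for j in range(1, n - 1):
            nxt[j] = cur[j + 1] + cur[j - 1] - prev[j]
        prev, cur = cur, nxt
    return cur


# Standing wave on [0, 1]: y(x, t) = sin(pi x) cos(pi c t), zero initial velocity.
n_cells = 50
dx = 1.0 / n_cells
xs = [j * dx for j in range(n_cells + 1)]
d0 = [math.sin(math.pi * x) for x in xs]
d0[0] = d0[-1] = 0.0
v0 = [0.0] * len(xs)
approx = string_displacement(d0, v0, c=2.0, dx=dx, steps=30)
```

For this particular initial state the exact solution is known, so the approximation can be compared against $\sin(\pi x)\cos(\pi c t)$ at $t = 0.3$.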




[Figure 2: a string fixed at its end points $a$ and $b$, with a point $x$ marked between them.] 

Fig. 2. A vibrating string. 



3 Computability and Equational Specification of Spatially 
Extended Systems 

The general definitions in Section 2 specify the mathematical form of a 
large class of models of deterministic spatially extended systems and their 
behaviour in time. Both computing systems and physical systems are well 
represented in this class. In this section we reflect on general methods of: 

(i) defining the models using systems of equations; and 

(ii) defining the models using algorithms. 

Obviously, such methods are central in the modelling of both computing 
and physical systems. We believe that they have a common form. For this 
section, we assume the reader is familiar with the theory of algebraic spec- 
ifications, as described in [Wir91] and [MT92], for example. The contents 
of this section are not used in the rest of the paper. 

3.1 Modelling Systems 

Among the aims of mathematically modelling a system are simulation 
and analysis; specifically, the following stand out: 

Computing. To compute the system function F. 

Reasoning. To prove properties of the system function F. 

For a computer system, the computation of F is its raison d’être: the 
system is intended to compute. In natural science, where a dynamical 
system models an aspect of a physical system, often the practice is to 
compute F in order to explore possible causes of the known behaviour of 
the system, to make predictions, and to discover new properties of the 
system. In applied science and engineering design, often the practice is to 
compute F in order to answer quantitative questions needed in solving a 
specific design problem. 

In both cases, we reason about the system to establish properties of its 
behaviour over time. Does it accomplish certain tasks, or exhibit certain 
properties? Is it fast, stable, error prone, etc.? 

For the design or investigation of a deterministic system, it is necessary 
to devise a mathematical model of the following form and answer the basic 
questions: 

System Characterisation Problem. Let 

$$S = (X, A, T, P, S(T,P), S(X,A) \mid F)$$

be a spatially extended deterministic system model with system function 

$$F : T \times S(T,P) \times S(X,A) \to S(X,A).$$

Computing. Does there exist an algorithm to compute $F$, or an approximation to $F$? 

Reasoning. Does there exist an axiomatic specification to characterise $F$ uniquely, or to reason about $F$ or an approximation of $F$? 

The problem of representing space, time, data, states, parameters and 
system functions on a computer involves both algorithms and specifica- 
tions. 

In computer science, a system model is usually discrete. The theoret- 
ical analysis of computing systems has led to equational techniques for 
their mathematical description, using methods based on Mathematical 
Logic and Abstract Algebra. 

In particular, equations can be found that provide an axiomatic spec- 
ification of the system F, such that: 

(i) the equations express key properties of the system; 

(ii) the function F satisfies the equations; and, often, 

(iii) F is the only solution of the equations, given certain constraints 
determined mainly by the semantics/model theory and proof the- 
ory of equations. 

In science and engineering, a system model is usually continuous, and 
a theoretical analysis of a system uses differential equations. Mathemat- 
ical Analysis provides a great deal of information about the solutions to 
differential equations. The differential equations constitute an axiomatic 
specification of the system function F, such that: 

(i) the equations express key properties of the system; 

(ii) the function F satisfies the equations; and, often, 

(iii) F is the only solution of the equations, given certain constraints 
determined mainly by the theory of continuity and differentiability. 

Since the model has continuous states, an approximation $F_0$ to $F$ must 
be computed, and algorithms for $F_0$ are needed. 

In both cases, there is a need for theoretical understanding of mathe- 
matical models that comes from an individual’s curiosity and the wish to 
solve practical design problems. Axiomatic specifications, such as equa- 
tions, express the system dynamics, have F as a solution, and are used for 
reasoning. Algorithms for computation are needed, whether the function 
F is computable exactly or approximately. 




Hierarchies of Spatially Extended Systems and Synchronous Concurrent Algorithms 197 



3.2 Equational Specification of Computable Data Types 

The algebraic theory of data is concerned with computability and ax- 
iomatic specifications and can help provide answers to the System Char- 
acterisation Problem in Section 3.1. 

Many sorted algebras are used to model data and operations on data. 
For an account of the subject, see [MT92]. A many sorted algebra can be 
represented on a computer if there exist appropriate data and algorithms 
to represent the sets and operations of the algebra. Let us say that such an 
algebra is effective. There are various mathematical concepts that charac- 
terise which algebras can be represented on a computer. For an account 
of the subject see [SHT95]. 

In the case of countable algebras viewed as discrete structures, there 
are principally: 

— computable algebras, 

— semicomputable algebras, and 

— co-semicomputable algebras 

which have been well-studied. A computable algebra is one which possesses a representation by an algebra of natural numbers in which the carrier sets, basic operations and equality relation are all recursive. Semicomputable and co-semicomputable algebras are weaker in that their equality relations are recursively enumerable and co-recursively enumerable respectively; see [SHT95]. 

In the case of uncountable algebras, viewed as continuous or topolog- 
ical structures, the situation is more complex. Effective topological alge- 
bras are defined by means of effective countable discrete structures that 
provide effective approximations. There are various methods for defining 
algorithmic approximations to topological algebras including: recursive 
metric spaces [Mos64], type two enumerability [Wei87], and domain rep- 
resentability [SHT88,SHT95,Bla97]. The approaches converge and there 
is emerging a stable theory of computation on topological algebras: see 
[SHT97]. 

At the heart of the theory of data there are axiomatic specification 
methods for defining countable algebras uniquely up to isomorphism, 
principally: 

— equations with initial algebra semantics, 

— equations with various pre-initial algebra semantics based on term rewriting semantics, and 

— equations with final algebra semantics. 

Concepts of computability and equational definability have been combined in a series of classification theorems, due to J A Bergstra and J V Tucker. These characterise those countable algebras that can be represented on a computer using equations. Roughly speaking, the theorems have the following form: 

Let $A$ be a many sorted algebra. The following are equivalent: 

(i) $A$ is definable by algorithms; and 

(ii) $A$ is uniquely definable by a finite set of equations. 

Here is one such theorem that we will apply to dynamical systems shortly, 
in Section 3.3: 



Theorem [BT82]. Let $A$ be a many sorted minimal $\Sigma$-algebra with $n$ sorts. The following are equivalent: 

1. $A$ is computable. 

2. There is an equational specification $(\Sigma', E')$ such that 

(i) $Sorts(\Sigma') = Sorts(\Sigma)$; 

(ii) $\Sigma' - \Sigma$ contains $3(n+1)$ hidden functions; 

(iii) $E'$ contains $2(n+1)$ equations; 

(iv) $A \cong Initial(\Sigma', E') \cong Final(\Sigma', E')$. 

The power of equations to capture, in principle, all we require of a 
data type can be illustrated in many ways; in the case of the theorem 
above we need only contemplate the following table of sizes of equational 
specification: 



Sorts              1   2   3   4   5   6   7   8
Hidden Functions   6   9  12  15  18  21  24  27
Equations          4   6   8  10  12  14  16  18
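The rows of the table follow directly from the bounds $3(n+1)$ and $2(n+1)$ stated in the theorem. A short sketch that regenerates them (the function name `specification_size` is introduced here for illustration):

```python
def specification_size(n_sorts):
    """Bounds from the theorem: 3(n+1) hidden functions and 2(n+1) equations."""
    return 3 * (n_sorts + 1), 2 * (n_sorts + 1)


for n in range(1, 9):
    hidden, equations = specification_size(n)
    print(n, hidden, equations)
```

For $n = 6$, the number of sorts of a spatially extended system algebra, this gives 21 hidden functions and 14 equations.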



3.3 Systems as Data Types 

A spatially extended system is a data type and is modelled by an alge- 
bra. The Characterisation Problem of Section 3.1 can be reformulated 
algebraically in the light of the ideas in Section 3.2: 
Algebraic System Characterisation Problem. Let 

$$S = (X, A, T, P, S(T,P), S(X,A) \mid F)$$

be a spatially extended deterministic system model. 

Computability. Is the algebra effective? 

Reasoning. Does there exist an axiomatic specification to characterise uniquely and reason about the algebra, or about an approximation to the algebra? 

To address this algebraic problem we must look closely at the algebraic 
structure of a spatially extended system. Clearly, the system algebra S 
simply captures the system behaviour. Obviously, there are more func- 
tions on the carrier sets that are of importance in the design of a model. 
The purpose and nature of these functions vary greatly over the different 
models — only the system function is common to all models. 

For example, in the case of the space $X$ there might be functions 
that construct a discrete grid or mesh, specify a directed multigraph, or 
define the sub-basis of a topology on $X$. In the case of the data set $A$ 
there might be logical operations on n-bit words, or continuous functions 
on the real numbers; and similarly for the input parameter set P. In the 
case of time, functions expressing system delays might be present. Among 
the operations on states and streams will be evaluation and substitution 
functions. 

The key point is that from the basic assumptions about the spatially 
extended system, a family of functions will be developed from which the 
system function will be definable by systems of equations. 

Let us introduce some terminology. 



Definition. An algebra $S_0$ is a construction of the system algebra $S$ if $S$ is a reduct of $S_0$, i.e., $S_0|_\Sigma = S$ where $\Sigma$ is the signature of $S$. 

Obviously, these ideas are very general and a great deal of variation in a construction $S_0$ of $S$ is possible. However, with the help of the theory of data, some general facts can be proved, at least in the countable case. 

Here are some further properties of a construction $S_0$ of $S$ that are important. 

First, there is the property that $S_0$ has exactly the same carrier sets as $S$ and so $S_0$ does not introduce new sets of data. 

Secondly, there is the property that $S_0$ is a minimal algebra, which means all the elements of the algebra can be constructed from the operations of the algebra applied to the constants of the algebra. It is not difficult to show that any infinite countable $n$-sorted algebra can be expanded into a minimal algebra by adding at most $k < n + 2$ functions.^1 Note that system algebras are infinite since time $T$ is assumed to be an infinite set. 

Consider the question in the case of countable systems using the strong notions of a computable algebra and an equational specification under initial and final algebra semantics. We can apply the Bergstra-Tucker theorem in the case of 6 sorts: 

System Characterisation Theorem. Let 

$$S = (X, A, T, P, S(T,P), S(X,A) \mid F)$$

be a spatially extended deterministic system, and let $S_0$ be any minimal construction algebra for $S$. Then the following are equivalent: 

(i) The system $S_0$ is computable. 

(ii) The system $S_0$ can be uniquely characterised by 14 equations and 21 hidden operators under initial algebra semantics and final algebra semantics. 

^1 There are two cases for the expansion of $A$ to a minimal algebra $A'$ as follows. Let $A_s$ be a carrier of $A$. We say that carrier $A_s$ is minimal if it is generated by the operations and constants of $A$; otherwise it is non-minimal. 

Suppose $A$ contains an infinite minimal carrier $A_s$; then for each carrier $A_t$ that is non-minimal we add to the operations of $A$ a surjection $f_t : A_s \to A_t$ to make $A'$; and so $k$ = the number of non-minimal carriers. 

Suppose $A$ does not contain an infinite minimal carrier but contains an infinite carrier $A_s$. Then we pick an element $a$ of $A_s$ and a function $f : A_s \to A_s$ that can generate the set $A_s = \{f^n(a) \mid n = 0, 1, 2, \ldots\}$ and add $a$ and $f$ to the algebra $A$ to make $A'$. Thus, in the “worst” case that an infinite algebra $A$ has no minimal carrier we need to add $(n - 1) + 2 = n + 1$ operations. 

If $A$ is computable then it is possible to choose operations of the above form such that $A'$ is computable. Such simple algebraic tricks may be used to ensure that a computable system construction algebra can be expanded to a minimal and computable system construction algebra. 

4 Abstraction and Approximation Between Spatially 
Extended Deterministic Systems 

A spatially extended system can be viewed at different levels of abstraction. This leads to a portfolio of models of the system that analyse different aspects with different degrees of detail. Some of these models may have clear relations and form a hierarchy. The purpose of a hierarchy is to better understand the system by analysing different properties at “appropriate” levels of detail. In particular, the hierarchy typically contains models of the spatially extended system at different scales of space, time and data. 

In computer science, the use of hierarchy is firmly established in both 
theory and practice. For example, if the system is a microprocessor then 
the levels range from the physical device to the programmer’s model of 
the machine. Furthermore, there are several layers of software that are 
built on top of the architecture ranging from operating systems to user 
interfaces. We think of the levels as independent autonomous systems 
that have increasingly complex relationships with one another as one 
approaches the physical system. We do not normally think of the user 
interface, the operating system or the architecture as simply imperfect 
approximations of the system. Modern computer science, in practice, is 
the study of everything but the “real” system (which is the province of 
electrical engineers and physicists). 

The same is true of physical systems. Here models of different aspects 
of systems are abstractions of a real physical system. Depending on prop- 
erties of the system that are known experimentally, more commonly they 
are described as simply imperfect approximations of the system. For ex- 
ample, cardiac tissue can be modelled at the cellular level — with different 
degrees of detail or accuracy — or more abstractly as wave propagation in 
an excitable medium — again with different degrees of detail or accuracy. 
Hierarchy is important in scientific practice but at present we know of 
relatively little to serve as a theory of hierarchical structure of mathe- 
matical models. The concepts of abstraction and approximation overlap 
in discussions of mathematical modelling. Here we use the term abstrac- 
tion except where the behaviour of a model is approximated by another 
to some known numerical bound. 

In this section, we will present concepts that define formally the re- 
lationship between two SESs at different levels of space, time, state and 
system behaviour. Later, we will use these ideas to develop a theory of 
hierarchical structure for our algorithmic models. 

Let 

$$S_1 = (X_1, A_1, T_1, P_1, S_1(T_1,P_1), S_1(X_1,A_1) \mid F_1)$$
$$S_2 = (X_2, A_2, T_2, P_2, S_2(T_2,P_2), S_2(X_2,A_2) \mid F_2)$$

be spatially extended deterministic systems, and suppose that $S_1$ is a more detailed model or is at a lower level of abstraction than $S_2$. First we define component abstractions, then we consider behaviour abstractions and approximations. 

4.1 Abstractions of Components 

We compare the spaces $X_1$ and $X_2$, the sets $T_1$ and $T_2$ of time points, the global state sets $S_1(X_1,A_1)$ and $S_2(X_2,A_2)$ and the sets $S_1(T_1,P_1)$ and $S_2(T_2,P_2)$ of parameter streams. 

Space abstraction. A respacing between $X_1$ and $X_2$ is a surjective map $\pi : X_1 \to X_2$. The intention is that each point $x \in X_1$ is abstracted in $S_2$ by $\pi(x) \in X_2$. Let $\pi^{-1} : X_2 \to Powerset(X_1)$ be defined by $\pi^{-1}(y) = \{x \in X_1 \mid \pi(x) = y\}$ so that $\pi^{-1}(y) \subseteq X_1$ is the set of all points in $X_1$ abstracted by $y \in X_2$. Maps $\pi$ and $\pi^{-1}$ are illustrated in Figure 3. There are further natural properties of a respacing that we will not need here. 
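A respacing and its inverse-image map can be sketched for finite spaces as follows. The spaces and the map `pi` (pairing neighbouring points) are hypothetical examples chosen for illustration.

```python
def preimages(pi, X1):
    """Return the map y -> {x in X1 | pi(x) = y} as a dictionary."""
    inv = {}
    for x in X1:
        inv.setdefault(pi(x), set()).add(x)
    return inv


X1 = range(6)                 # detailed space: six points
X2 = {0, 1, 2}                # abstract space: three points
pi = lambda x: x // 2         # hypothetical respacing: pair up neighbours

inv = preimages(pi, X1)
is_surjective = set(inv) == X2    # every abstract point abstracts something
print(inv, is_surjective)
```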




Fig. 3. Abstraction between spaces $X_1$ and $X_2$. 



Time abstraction. A retiming between the sets $T_1$ and $T_2$ of time points is a surjective, monotonic map $\lambda : T_1 \to T_2$ with the intention that each time point $t \in T_1$ is abstracted in $T_2$ by $\lambda(t)$. A retiming provides a simple temporal abstraction in which a time instant or cycle $t \in T_2$ is represented by the set $\lambda^{-1}(t) \subseteq T_1$. Although this notion is not suitable for the case that $T_1 = \mathbb{N}$ and $T_2 = \mathbb{R}^+$, since the condition of surjectivity is impossible, it and its refinements are useful in many situations. A typical refinement is in the case that $T_1$ and $T_2$ are both $\mathbb{N}$ or both $\mathbb{R}^+$, where we can say that $T_1$ is faster than $T_2$ if, for each interval $[t, t'] \subseteq T_1$, we have $t' - t \ge \lambda(t') - \lambda(t)$. 
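A simple retiming between two discrete clocks is integer division by a fixed factor. The sketch below checks the stated properties on a finite window of time points only; the factor 3, the window size and the reading of the "faster than" inequality as non-strict are assumptions made here.

```python
def lam(t, k=3):
    """Hypothetical retiming: k cycles of the fast clock per slow cycle."""
    return t // k


window = range(30)

monotonic = all(lam(t) <= lam(t + 1) for t in window)
hits = {lam(t) for t in window}              # slow cycles reached on the window
faster = all(
    t2 - t1 >= lam(t2) - lam(t1)
    for t1 in window for t2 in window if t1 <= t2
)
fibre = [t for t in window if lam(t) == 4]   # fast cycles representing slow cycle 4
print(monotonic, faster, fibre)
```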



State abstraction. Global states are compared using a map of the form 

$$\phi : S_1(X_1,A_1) \to S_2(X_2,A_2)$$

with the intention that the global state $s \in S_1(X_1,A_1)$ of $S_1$ is abstracted in $S_2$ by $\phi(s) \in S_2(X_2,A_2)$. 

A state abstraction mapping $\phi$ is spatially consistent with the respacing $\pi$ if, for any point $y \in X_2$, the abstracted local state $\phi(s)(y)$ at $y$ depends only on the state of the subspace $\pi^{-1}(y) \subseteq X_1$ abstracted by $y$. Formally, $\phi$ is consistent with $\pi$ if, and only if, for all states $s, s' \in S_1(X_1,A_1)$, and all points $y \in X_2$, 

$$s(x) = s'(x) \text{ for all } x \in \pi^{-1}(y) \implies \phi(s)(y) = \phi(s')(y).$$
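For finite spaces and data, spatial consistency can be checked by brute force over all pairs of states. In this sketch the spaces, the respacing and both candidate maps `phi_local` and `phi_leaky` are hypothetical; the second is deliberately inconsistent because it reads a point outside the relevant inverse image.

```python
from itertools import product

X1 = [0, 1, 2, 3]
X2 = [0, 1]
pi = lambda x: x // 2
fibre = {y: [x for x in X1 if pi(x) == y] for y in X2}


def phi_local(s):
    """phi(s)(y) = OR of the bits of s over the inverse image of y."""
    return {y: max(s[x] for x in fibre[y]) for y in X2}


def phi_leaky(s):
    """Also reads point 0, which lies outside the inverse image of y = 1."""
    return {y: max(s[x] for x in fibre[y]) ^ s[0] for y in X2}


def spatially_consistent(phi):
    """Check: s = s' on the inverse image of y implies phi(s)(y) = phi(s')(y)."""
    states = [dict(zip(X1, bits)) for bits in product([0, 1], repeat=len(X1))]
    for s, s2 in product(states, repeat=2):
        for y in X2:
            agree = all(s[x] == s2[x] for x in fibre[y])
            if agree and phi(s)[y] != phi(s2)[y]:
                return False
    return True


print(spatially_consistent(phi_local), spatially_consistent(phi_leaky))
```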



Abstraction of observable states. Where we define sets $X_{1,obs} \subseteq X_1$ and $X_{2,obs} \subseteq X_2$ of observable points in the two systems, we say that $\phi$ is consistent with respect to observable states if, and only if, for all states $s, s' \in S_1(X_1,A_1)$, and all points $y \in X_{2,obs}$, 

$$s(x) = s'(x) \text{ for all } x \in X_{1,obs} \implies \phi(s)(y) = \phi(s')(y).$$

A state abstraction map $\phi$ that is consistent with respect to observable states determines an observable state abstraction map 

$$\phi_{obs} : S_{1,obs}(X_1,A_1) \to S_{2,obs}(X_2,A_2)$$

as follows. Let $s \in S_1(X_1,A_1)$ be any global state of $S_1$ and let $s_{obs} = s|_{X_{1,obs}} \in S_{1,obs}(X_1,A_1)$ be its observable part. Then, for all $y \in X_{2,obs}$, 

$$\phi_{obs}(s_{obs})(y) = \phi(s)(y).$$

We say that $\phi_{obs}$ is the observable part of $\phi$. 
Parameter abstraction. Parameter streams are compared using a function of the form 

$$\theta : S_1(T_1,P_1) \to S_2(T_2,P_2)$$

with the intention that a parameter stream $p \in S_1(T_1,P_1)$ is abstracted in $S_2$ by $\theta(p) \in S_2(T_2,P_2)$. The function $\theta$ is temporally consistent with the retiming $\lambda$ if, for all $p, p' \in S_1(T_1,P_1)$ and $t \in T_2$, 

$$p(c) = p'(c) \text{ for all } c \in \lambda^{-1}(t) \implies \theta(p)(t) = \theta(p')(t).$$

4.2 Abstraction and Approximation of Behaviour 

We now define how system behaviours are related using abstraction con- 
cepts for time, states and parameter streams. There are four basic no- 
tions dealing with abstraction and approximation for global and observ- 
able behaviour. We will define notions that are suited to our later case 
studies: abstraction of global behaviour and approximation of observable 
behaviour. It is straightforward to define the other two notions. 

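Abstraction of global behaviour. The behaviour of system $S_2$ is an abstraction of $S_1$ with respect to retiming $\lambda$, parameter abstraction map $\theta$ and state abstraction map $\phi$ if, for all $t \in T_1$, $p \in S_1(T_1,P_1)$ and $s \in S_1(X_1,A_1)$, 

$$\phi(F_1(t,p,s)) = F_2(\lambda(t), \theta(p), \phi(s))$$

or, equivalently, if the following diagram commutes: 

$$
\begin{array}{ccc}
T_2 \times S_2(T_2,P_2) \times S_2(X_2,A_2) & \xrightarrow{\;F_2\;} & S_2(X_2,A_2) \\
\uparrow {\scriptstyle (\lambda,\,\theta,\,\phi)} & & \uparrow {\scriptstyle \phi} \\
T_1 \times S_1(T_1,P_1) \times S_1(X_1,A_1) & \xrightarrow{\;F_1\;} & S_1(X_1,A_1)
\end{array}
$$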
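The commuting condition can be tested exhaustively on a small example. In the sketch below both systems, the identity retiming and the parity maps are hypothetical choices: the detailed system counts inputs modulo 4 at two cells, and the abstract system tracks only the parity of one of them. The check covers a finite family of periodic streams and a finite time horizon, so it illustrates the condition rather than proving it for all streams.

```python
from itertools import product


def F1(t, p, s):
    """Detailed system: each cell adds the inputs read so far, modulo 4."""
    total = sum(p(k) for k in range(t))
    return {x: (a + total) % 4 for x, a in s.items()}


def F2(t, q, s):
    """Abstract system: a single cell that tracks parity only."""
    total = sum(q(k) for k in range(t))
    return {y: (a + total) % 2 for y, a in s.items()}


lam = lambda t: t                           # identity retiming
theta = lambda p: (lambda t: p(t) % 2)      # keep the parity of each input
phi = lambda s: {"y": s["x0"] % 2}          # observe the parity of cell x0


def is_abstraction(horizon=5):
    """Check phi(F1(t, p, s)) == F2(lam(t), theta(p), phi(s)) on a finite sample."""
    streams = [lambda k, w=w: w[k % len(w)] for w in product(range(4), repeat=3)]
    states = [{"x0": a, "x1": b} for a, b in product(range(4), repeat=2)]
    return all(
        phi(F1(t, p, s)) == F2(lam(t), theta(p), phi(s))
        for t in range(horizon) for p in streams for s in states
    )


print(is_abstraction())
```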



Approximation of observable behaviour. Let $F_{1,obs}$ and $F_{2,obs}$ be the observable state functions of $S_1$ and $S_2$ respectively. Let 

$$d : S_{2,obs}(X_2,A_2) \times S_{2,obs}(X_2,A_2) \to \mathbb{R}^+$$

be a metric that compares two observable states of system $S_2$. We say that the observable behaviour of $S_2$ approximates that of $S_1$ with respect to retiming $\lambda$, parameter abstraction map $\theta$, state abstraction map $\phi$ with observable part $\phi_{obs}$, state comparison metric $d$ and tolerance $\epsilon \in \mathbb{R}^+$, if for all $t \in T_1$, $p \in S_1(T_1,P_1)$ and $s \in S_1(X_1,A_1)$, 

$$d(\phi_{obs}(F_{1,obs}(t,p,s)), F_{2,obs}(\lambda(t), \theta(p), \phi(s))) < \epsilon.$$

Generalisations of abstraction and approximation notions. It is often the case that $S_1$ exhibits behaviours that are not abstracted or approximated by $S_2$. For example, a detailed model $S_1$ of a physical system might display complex spatio-temporal behaviours that are observable at the abstract level (via a state abstraction map $\phi$), but are not reproduced by a simpler model $S_2$ of the same physical system. 

One method to deal with such cases is to consider $\phi$ as a function that maps states of $S_1$ onto partial states of $S_2$, where if $\phi(s)(y)\downarrow$ then $\phi(s)(y) \in A_2$ is the local state of $y \in X_2$ abstracted from $s \in S_1(X_1,A_1)$, and $\phi(s)(y)\uparrow$ denotes that the state $s$ is not abstracted by $\phi$ for $S_2$ point $y$. Let 

$$S_{2,p}(X_2,A_2) = \{ s_p \in [X_2 \rightharpoonup A_2] \mid (\exists s \in S_2(X_2,A_2))(\forall y \in X_2)[s_p(y) = s(y) \text{ or } s_p(y)\uparrow] \}$$

be the set of all partial global states of $S_2$ (where $\rightharpoonup$ denotes partial functions) and let $\phi : S_1(X_1,A_1) \to S_{2,p}(X_2,A_2)$ be a state abstraction map. Let $dom_p(\phi) \subseteq S_1(X_1,A_1)$ be the set of all $S_1$ states that map to total $S_2$ states under $\phi$. 

To be able to define notions of observable approximation or abstraction using $\phi$ it is sufficient to require that $dom_p(\phi)$ is non-empty and that its observable part $\phi_{obs}$ always maps to fully defined states (that is, it must be of the form $\phi_{obs} : S_{1,obs}(X_1,A_1) \to S_{2,obs}(X_2,A_2)$). 

It is often also the case that the set $S_1(T_1,P_1)$ of parameter streams for $S_1$ might contain elements that are not abstracted by elements in the set $S_2(T_2,P_2)$. In such a case we allow $\theta$ to be a partial function with domain $dom(\theta)$ which we require to be non-empty. 

We say that the observable behaviour of $S_2$ approximates that of $S_1$ with respect to retiming $\lambda$, global state abstraction map $\phi$ with observable part $\phi_{obs}$ that maps only onto total states, parameter stream abstraction map $\theta$, state comparison metric $d$ and tolerance $\epsilon \in \mathbb{R}^+$, if for all $t \in T_1$, $p \in dom(\theta)$ and $s \in dom_p(\phi)$, 

$$d(\phi_{obs}(F_{1,obs}(t,p,s)), F_{2,obs}(\lambda(t), \theta(p), \phi(s))) < \epsilon.$$
Further, we may consider the case that $S_2$ abstracts or approximates $S_1$ only for a subset $sub(T_1) \subseteq T_1$ of time points (typically, for an initial segment, or at regular time points) and on subsets $sub(T_1,P_1) \subseteq S_1(T_1,P_1)$ of parameter streams and $sub(X_1,A_1) \subseteq S_1(X_1,A_1)$ of initial states. For example, we say that the observable behaviour of $S_2$ approximates that of $S_1$ with respect to the abstraction maps, the above subsets, the state comparison metric $d$ and tolerance $\epsilon \in \mathbb{R}^+$, if the above inequality holds for all $t \in sub(T_1)$, $p \in sub(T_1,P_1)$ and $s \in sub(X_1,A_1)$. 

5 Synchronous Concurrent Algorithms 

We now turn our attention to a concrete model of computation for spa- 
tially extended systems. Synchronous concurrent algorithms (SCAs) are 
algorithms distributed discretely in space and operating in discrete time. 
The concept of an SCA was introduced in 1985 to model parallel deterministic computing systems, especially hardware; see [TT94]. Many mathematical models of physical and biological systems have also been shown to be SCAs, including cellular automata [HTT91], coupled map lattices [HTZP92,HPTZ94,HPT96a], neural networks [HPT96b], and discrete approximations of PDEs and CODEs. SCAs are easily seen to be equationally specifiable and computable; SCA theory is built upon the theory 
of primitive recursive functions over many-sorted algebras [Tuc91,TZ92]. 
SCAs have also been studied using process algebra [BHP97,BP99]. 

In this section we define the simplest type of SCA which has a single 
clock. In Section 5.2 we relate the general notion of an SCA to that of an 
SES. We present two examples of SCAs in Sections 5.3 and 5.4: a systolic 
array and a model of electrical activity in a strand of cardiac tissue. 



5.1 Formal Definition 

An SCA N is characterised by its global clock, the data it processes, 
and its architecture of modules, channels and sources. For simplicity, we 
describe SCAs where modules compute on a single data set, although it 
is not difficult to extend the definition to the many-sorted case.



Data and time. We assume that the algorithm $N$ computes over data
from a non-empty set $A$ and is synchronised by a discrete clock
$T = \{0, 1, 2, \ldots\}$.







Modules and channels. Let $I$ be a finite non-empty set of modules. A
module is an atomic computing device capable of some specific internal
processing. Let each module $i \in I$ have $p(i)$ inputs and a finite set

    \[ Ch_i = \{(i,u), (i,v), \ldots\} \]

of output channels, where $u, v, \ldots$ are identifiers chosen from a set $Var$.
Channels have unit bandwidth with respect to $A$ (i.e., at any time they
hold a single datum) and are unidirectional. Let $Ch$ denote the set of all
network channels, defined by

    \[ Ch = \bigcup_{i \in I} Ch_i. \]




Let each module $i \in I$ compute a function

    \[ f_{i,v} : A^{p(i)} \to A \]

for each of its output channels $(i,v) \in Ch_i$ with the intention that if
values $b_1, \ldots, b_{p(i)} \in A$ arrive on its $p(i)$ inputs (each of which may be a
channel from another module, or an input source supplying data from the
external environment) then module $i$ outputs the value $f_{i,v}(b_1, \ldots, b_{p(i)})$
along channel $(i,v)$. For clarity, when the relationship between module
functions and channel names is well understood, we might denote $f_{i,v}$ by
$f_i$, $f_{i,u}$ by $g_i$, etc.

Sources and output channels. An SCA $N$ may operate on infinite
sequences or streams of data from the set $[T \to A]$. Let $In$ be a finite
(possibly empty) set of network inputs or sources. Each source supplies
the network with a single stream of input data. An SCA without sources
is termed closed.

Network $N$ processes a set of input streams of the form $a \in [T \to A]^{In}$
where $a(i) \in [T \to A]$ (often written $a_i$) is the stream supplied by source
$i$, and $a_i(t) \in A$ is the value supplied by source $i$ at time $t$.

Data is read from $N$ at its output channels. Let $Out \subseteq Ch$ be the set of
all network output channels.



Architecture. The architecture of a network $N$ is its topological
structure of modules, channels and sources. We define two wiring maps to
formalise the connections between modules, channels and sources. Let

    \[ \alpha : I \times \mathbb{N} \to \{source, channel\} \]

    \[ \beta : I \times \mathbb{N} \to In \cup Ch \]

be partial functions that enumerate inputs to modules in the following
way: for any module $i \in I$, for $j \in \{1, \ldots, p(i)\}$, $\alpha(i,j)$ says whether
the $j$-th input connection to module $i$ is from a source or a channel, and
$\beta(i,j)$ gives the index of that source or channel. If $j \notin \{1, \ldots, p(i)\}$, then
$\alpha(i,j)$ and $\beta(i,j)$ are undefined.

For each module $i$, let

    \[ snhd(i) = \{\beta(i,j) \mid j \in \{1, \ldots, p(i)\} \text{ and } \alpha(i,j) = source\} \]

be the source neighbourhood of $i$ comprising all sources that supply $i$ with
data. Similarly, let

    \[ cnhd(i) = \{\beta(i,j) \mid j \in \{1, \ldots, p(i)\} \text{ and } \alpha(i,j) = channel\} \]

be the channel neighbourhood of $i$ comprising all channels that supply $i$
with data.

SCA Equations. An SCA $N$ as described above computes on streams
$a \in [T \to A]^{In}$ from a state $s \in A^{Ch}$ of initial channel values, where
$s(i,v) \in A$ (often written $s_{i,v}$) is the initial value held on channel $(i,v)$.
For each channel $(i,v) \in Ch$ we define a channel state function

    \[ V_{i,v} : T \times [T \to A]^{In} \times A^{Ch} \to A \]

where $V_{i,v}(t,a,s)$ denotes the datum held on $(i,v)$ at time $t$ when the
network is executed on input streams $a$ from initial state $s$. We define
$V_{i,v}$ by induction on the clock $T$, as follows:

Time 0. At time 0, each channel is initialised with the datum $s_{i,v} \in A$;
thus

    \[ V_{i,v}(0,a,s) = s_{i,v}. \]







Time t + 1. The value held by channel $(i,v)$ at time $t+1$ is evaluated by
module $i$ with function $f_{i,v} : A^{p(i)} \to A$. If $b_1, \ldots, b_{p(i)}$ are the values held
on the inputs of module $i$ at time $t$, then the value of $(i,v)$ at time $t+1$
is $f_{i,v}(b_1, \ldots, b_{p(i)})$. The channels and sources bearing the input values
$b_1, \ldots, b_{p(i)}$ are determined by the wiring maps $\alpha$ and $\beta$. Thus,

    \[ V_{i,v}(t+1,a,s) = f_{i,v}(b_1, \ldots, b_{p(i)}) \]

where for $j \in \{1, \ldots, p(i)\}$,

    \[ b_j = \begin{cases}
         a_{\beta(i,j)}(t) & \text{if } \alpha(i,j) = source \\
         V_{\beta(i,j)}(t,a,s) & \text{if } \alpha(i,j) = channel.
       \end{cases} \]
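The inductive definition above can be read directly as an interpreter. The following Python sketch is one possible rendering and is not taken from the source: the dictionary encoding of the wiring maps (each module's input list holds pairs that combine $\alpha$ and $\beta$), the names `sca_step` and `run_sca`, and the two-module toy network are all assumptions made for illustration.

```python
def sca_step(functions, wiring, streams, state, t):
    """Compute the global state at time t+1 from the state at time t.

    functions: maps each channel (i, v) to its module function f_{i,v}
    wiring:    maps each module i to a list of ("source", index) or
               ("channel", index) pairs, i.e. alpha(i, j) and beta(i, j)
    streams:   maps each source index to a function of time
    state:     maps each channel to the datum it holds at time t
    """
    new_state = {}
    for channel, f in functions.items():
        module = channel[0]
        inputs = []
        for kind, index in wiring[module]:
            if kind == "source":
                inputs.append(streams[index](t))   # b_j = a_{beta(i,j)}(t)
            else:
                inputs.append(state[index])        # b_j = V_{beta(i,j)}(t, a, s)
        new_state[channel] = f(*inputs)
    return new_state


def run_sca(functions, wiring, streams, initial_state, steps):
    """Return the list of global states at times 0, 1, ..., steps."""
    history = [dict(initial_state)]
    for t in range(steps):
        history.append(sca_step(functions, wiring, streams, history[-1], t))
    return history


# Hypothetical toy network over the integers: module m1 adds its source
# value to the output of m2, and m2 is a unit delay fed by m1.
functions = {
    ("m1", "v"): lambda x, y: x + y,
    ("m2", "v"): lambda y: y,
}
wiring = {
    "m1": [("source", "in"), ("channel", ("m2", "v"))],
    "m2": [("channel", ("m1", "v"))],
}
streams = {"in": lambda t: 1}
history = run_sca(functions, wiring, streams,
                  {("m1", "v"): 0, ("m2", "v"): 0}, 4)
```

Every channel is updated from the state at time $t$ only, which is what makes the network synchronous: no module sees a value computed during the current cycle.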

Global state functions. The channel state functions $V_{i,v}$ tell us the value
of a particular channel, given a time, input streams and initial data. We
may combine these functions into a global state function

    \[ V : T \times [T \to A]^{In} \times A^{Ch} \to A^{Ch} \]

that gives the state of the entire network. The function $V$ is defined for
all $t \in T$, $a \in [T \to A]^{In}$ and $s \in A^{Ch}$, by

    \[ V(t,a,s) = (V_{i,v}(t,a,s) \mid (i,v) \in Ch). \]

Output functions. The global state function tells us the values at all the
network's channels. Restricting this to output channels gives us an output
function

    \[ V_{out} : T \times [T \to A]^{In} \times A^{Ch} \to A^{Out} \]

where $V_{out}(t,a,s)$ is the output of $N$ at time $t$ given input streams $a$ and
initial data $s$. We define $V_{out}$, for all $t \in T$, $a \in [T \to A]^{In}$ and $s \in A^{Ch}$,
by

    \[ V_{out}(t,a,s) = (V_{i,v}(t,a,s) \mid (i,v) \in Out). \]

It is often useful to regard $N$ as a stream transformer. We define a stream
transformer version

    \[ \hat{V}_{out} : [T \to A]^{In} \times A^{Ch} \to [T \to A]^{Out} \]

of $V_{out}$ with coordinates $\hat{V}_{out,(i,v)}$ for all output channels $(i,v) \in Out$
defined for all $a \in [T \to A]^{In}$, $s \in A^{Ch}$ and $t \in T$, by

    \[ \hat{V}_{out,(i,v)}(a,s)(t) = V_{i,v}(t,a,s) \]

such that $\hat{V}_{out}(a,s)$ is the $Out$-indexed set of streams that are output
from $N$ given input streams $a$ and initial state $s$.

5.2 SCAs as Spatially Extended Systems

An SCA is essentially a spatially extended system with a discrete clock 
whose system function is definable by primitive recursive equations of the 
form of Section 5.1. 

Identifying elements of a general SCA with the space, state, parameter 
stream, system function, and observable state function components of a 
spatially extended system is straightforward. 

We can identify either the set $Ch$ of channels or the set $I$ of modules
with space. In the former case, each point (channel) in the space has
a single state at any time instant. In the latter each point (module) is,
in general, multi-valued where each output channel is identified with a
component of the point's local state. In both cases the set of global state
coordinates is given by $Ch$. In the remainder of the paper, we identify
the set $I$ with space, although it is not difficult to reformulate the ideas
using $Ch$ as the space set.

The set of possible global states of the system is $A^{Ch}$. The set of
parameter values is $A^{In}$ and the set of all possible parameter streams is
$[T \to A^{In}]$ although we often prefer to use the equivalent set $[T \to A]^{In}$.

The SCA's global state function $V$ is identified as the system function.
The set $Out \subseteq Ch$ of output channels is identified as the set of observable
state coordinates, and $V_{out}$ as the observable state function.

5.3 Example: Systolic Convolver 

We consider a first example of an SCA: a systolic convolver introduced 
in [Kun82]; convolution has many important applications in digital signal 
processing. In this section we define the task of convolution, and specify 
the convolver as an SCA in the style of Section 5.1. The algorithm has 
been studied as an SCA in [HTT88], where its correctness was proven. 

Task. Let $A$ be a commutative ring and let $r = (r_1, \ldots, r_n) \in A^n$ be an
$n$-vector which we call a reference word. Let $conv_r : A^n \to A$ be the inner
product or convolution function defined with respect to $r$, for all $n$-vectors
$a = (a_1, \ldots, a_n) \in A^n$, by

    \[ conv_r(a_1, \ldots, a_n) = r_1 \cdot a_1 + \cdots + r_n \cdot a_n. \]

(For simplicity, we will assume $n > 2$). Let $C = \{0, 1, 2, \ldots\}$ be a clock. We
specify formally the task of convolution by means of a stream transformer

    \[ \Phi : [C \to A] \to [C \to (A \cup \{unspec\})] \]

defined, for all $a \in [C \to A]$ and $c \in C$, by

    \[ \Phi(a)(c) = \begin{cases}
         unspec & \text{if } c < n \\
         conv_r(a(c-n), \ldots, a(c-1)) & \text{if } c \geq n
       \end{cases} \]

so that the value output at time $c$ is the inner product of $r$ and the
previous $n$ values of the input stream. The value $unspec \notin A$ means
"unspecified" and is used here to denote the fact that we are not concerned
with the specification's output until time $n$. We illustrate $\Phi$ in the table
below. The first two columns give the output for the general specification.
The final two columns give the output of the specification in the case that
$A = \mathbb{Z}$, $n = 3$, $r = (1, 2, 3)$ and where the input stream $a : C \to A$ begins
with elements $a(0) = 2$, $a(1) = 1$, $a(2) = 3$, $a(3) = 1$, $a(4) = 3$, $a(5) = 2$,
$a(6) = 3, \ldots$.



    c       Phi(a)(c)                          c    Phi(a)(c)
    ------  ---------------------------------  ---  ---------
    0       unspec                             0    unspec
    1       unspec                             1    unspec
    2       unspec                             2    unspec
    ...     ...                                3    13
    n - 1   unspec                             4    10
    n       conv_r(a(0), ..., a(n - 1))        5    14
    n + 1   conv_r(a(1), ..., a(n))            6    13
    n + 2   conv_r(a(2), ..., a(n + 1))        7    16
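The specific columns of the table can be recomputed from the definition of $\Phi$. The Python sketch below is an illustration under stated assumptions: the names `conv`, `spec` and `UNSPEC` are hypothetical, the ring is taken to be the integers, and only the seven stream values quoted in the text are used.

```python
UNSPEC = "unspec"   # stands for the value unspec, which lies outside A


def conv(r, window):
    """Inner product conv_r(a_1, ..., a_n) = r_1*a_1 + ... + r_n*a_n."""
    return sum(ri * ai for ri, ai in zip(r, window))


def spec(r, a, c):
    """Phi(a)(c): unspec before time n, then conv_r of the previous n inputs."""
    n = len(r)
    if c < n:
        return UNSPEC
    return conv(r, [a(c - n + k) for k in range(n)])


r = (1, 2, 3)
prefix = [2, 1, 3, 1, 3, 2, 3]      # a(0), ..., a(6) as given in the text
a = lambda c: prefix[c]
outputs = [spec(r, a, c) for c in range(8)]
# outputs == ["unspec", "unspec", "unspec", 13, 10, 14, 13, 16]
```

For instance, $\Phi(a)(3) = 1 \cdot 2 + 2 \cdot 1 + 3 \cdot 3 = 13$, which is the first specified entry of the table.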



Algorithm. Consider the network illustrated in Figure 4. 



Fig. 4. A systolic convolver.



During every clock cycle, each module operates as follows. Module 1
reads some value $a$ from the network's input source $src$, and some value $b$
from channel $(2,v)$; it outputs the value $r_n \cdot a + b$ onto channel $(1,v)$ and
$a$ onto channel $(1,u)$. Module $j$, for $j = 2, \ldots, n-1$, reads two values $a$
and $b$ from channels $(j-1,u)$ and $(j+1,v)$ respectively, and outputs the
value $r_{n-j+1} \cdot a + b$ onto channel $(j,v)$ and the value $a$ along channel $(j,u)$.
Module $n$ reads some value $a$ from channel $(n-1,u)$ and outputs the value
$r_1 \cdot a$ onto channel $(n,v)$. Elements from the single input stream travel
along the right-facing channels $(1,u), (2,u), \ldots$ of the network where they
are multiplied by each element $r_j$ of the reference word $r$. Partial results
travel from right to left via channels $(n,v), (n-1,v), \ldots$ and appear as
inner-products at the output channel $(1,v)$.

Elements of a stream $a : C \to A$ must be supplied by the network
source only at even numbered clock cycles; padding values (e.g., 0) should
be supplied at odd cycles. We define the algorithm to compute with
respect to a network clock $T$, which is twice as fast as the specification's
clock $C$. From the example stream $a : C \to A$ above, we can define a
network input stream $a' : T \to A$ padded with alternate 0's, that begins
with $a'(0) = 2$, $a'(1) = 0$, $a'(2) = 1$, $a'(3) = 0$, $a'(4) = 3$, $a'(5) = 0, \ldots$.
Taking $A = \mathbb{Z}$, $n = 3$, $r = (1,2,3)$, and initial channel states to be 0,
we trace the execution of the convolver on the network input stream $a'$ in
the following table, where the upper row for each time gives the values on
the two $u$ channels and the lower row gives the three $v$ channels' values.
Notice that we obtain results (shown underlined) from the convolver at
times $2n - 1 = 5$, $2n + 1 = 7$, $2n + 3 = 9, \ldots$.



    t    a'(t)   row   module 1   module 2   module 3
    ---  -----   ---   --------   --------   --------
    0    2       u     0          0          -
                 v     0          0          0
    1    0       u     2          0          -
                 v     6          0          0
    2    1       u     0          2          -
                 v     0          4          0
    3    0       u     1          0          -
                 v     7          0          2
    4    3       u     0          1          -
                 v     0          4          0
    5    0       u     3          0          -
                 v     _13_       0          1
    6    1       u     0          3          -
                 v     0          7          0
    7    0       u     1          0          -
                 v     _10_       0          3
    8    3       u     0          1          -
                 v     0          5          0
    9    0       u     3          0          -
                 v     _14_       0          1
    10   2       u     0          3          -
                 v     0          7          0
    11   0       u     2          0          -
                 v     _13_       0          3
    12   3       u     0          2          -
                 v     0          7          0
    13   0       u     3          0          -
                 v     _16_       0          2

We use the basic SCA model to formalise the convolver and its
operation over a commutative ring $A$ with respect to clock $T$.

Channels and Modules. The index sets for the modules, sources, channels
and output channels are easily determined from Figure 4:

    \[ I = \{1, \ldots, n\} \]

    \[ In = \{src\} \]

    \[ Ch = \{(1,u), (1,v), (2,u), (2,v), \ldots, (n-1,u), (n-1,v), (n,v)\} \]

    \[ Out = \{(1,v)\}. \]

Module functions. For modules $j = 1, \ldots, n-1$, we define $g_j : A^2 \to A$
and $f_j : A^2 \to A$, associated with channels $(j,u)$ and $(j,v)$ respectively,
for all $a, b \in A$, by

    \[ g_j(a,b) = a \quad\text{and}\quad f_j(a,b) = r_{n-j+1} \cdot a + b. \]

For the rightmost module $n$ we define $f_n : A \to A$, associated with channel
$(n,v)$, for all $a \in A$, by $f_n(a) = r_1 \cdot a$.

Architecture. Let $\alpha$ and $\beta$ be wiring maps that define the following
neighbourhoods in a straightforward manner:

    \[ snhd(1) = \{src\} \]

    \[ snhd(j) = \emptyset \qquad j = 2, \ldots, n \]

    \[ cnhd(1) = \{(2,v)\} \]

    \[ cnhd(j) = \{(j-1,u), (j+1,v)\} \qquad j = 2, \ldots, n-1 \]

    \[ cnhd(n) = \{(n-1,u)\}. \]

SCA equations. The formal specification of the convolver's components
determines channel state functions

    \[ V_{j,w} : T \times [T \to A] \times A^{Ch} \to A \]

for each $(j,w) \in Ch$, defined for all $a \in [T \to A]$ and $s \in A^{Ch}$, at time 0
by $V_{j,w}(0,a,s) = s(j,w)$, and at time $t+1$ as follows:

    \[ V_{1,u}(t+1,a,s) = g_1(a(t), V_{2,v}(t,a,s)) \]

    \[ V_{1,v}(t+1,a,s) = f_1(a(t), V_{2,v}(t,a,s)) \]

    \[ V_{2,u}(t+1,a,s) = g_2(V_{1,u}(t,a,s), V_{3,v}(t,a,s)) \]

    \[ V_{2,v}(t+1,a,s) = f_2(V_{1,u}(t,a,s), V_{3,v}(t,a,s)) \]

    \[ \vdots \]

    \[ V_{n-1,u}(t+1,a,s) = g_{n-1}(V_{n-2,u}(t,a,s), V_{n,v}(t,a,s)) \]

    \[ V_{n-1,v}(t+1,a,s) = f_{n-1}(V_{n-2,u}(t,a,s), V_{n,v}(t,a,s)) \]

    \[ V_{n,v}(t+1,a,s) = f_n(V_{n-1,u}(t,a,s)). \]
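As a check on these equations, they can be executed on the padded example stream. The Python sketch below is illustrative only: the function names `step` and `run` and the dictionary representation of the global state are assumptions, the ring is taken to be the integers, and every channel starts at 0 as in the worked trace.

```python
def step(r, s, x):
    """Apply the convolver's SCA equations once.

    r: reference word (r_1, ..., r_n), stored 0-based, so r[n - j] is r_{n-j+1}
    s: global state at time t, keyed by channel (j, "u") or (j, "v")
    x: value a(t) supplied by the source src
    """
    n = len(r)
    nxt = {}
    nxt[(1, "u")] = x                                  # g_1
    nxt[(1, "v")] = r[n - 1] * x + s[(2, "v")]         # f_1 uses r_n
    for j in range(2, n):
        a, b = s[(j - 1, "u")], s[(j + 1, "v")]
        nxt[(j, "u")] = a                              # g_j
        nxt[(j, "v")] = r[n - j] * a + b               # f_j uses r_{n-j+1}
    nxt[(n, "v")] = r[0] * s[(n - 1, "u")]             # f_n uses r_1
    return nxt


def run(r, stream, steps):
    """Return the global states at times 0, ..., steps from the zero state."""
    n = len(r)
    s = {(j, "u"): 0 for j in range(1, n)}
    s.update({(j, "v"): 0 for j in range(1, n + 1)})
    trace = [s]
    for t in range(steps):
        s = step(r, s, stream(t))
        trace.append(s)
    return trace


padded = [2, 0, 1, 0, 3, 0, 1, 0, 3, 0, 2, 0, 3, 0]   # a'(0), ..., a'(13)
trace = run((1, 2, 3), lambda t: padded[t], 13)
outputs = [trace[t][(1, "v")] for t in (5, 7, 9, 11, 13)]
# outputs == [13, 10, 14, 13, 16]
```

The values read from channel $(1,v)$ at the odd times $5, 7, \ldots, 13$ agree with the specified outputs $\Phi(a)(3), \ldots, \Phi(a)(7)$ of the task table.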

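Stream transformer specification. We specify the input-output behaviour
of the convolver as a stream transformer

    \[ \hat{V}_{out} : [T \to A] \times A^{Ch} \to [T \to A] \]

defined, for all $a \in [T \to A]$, $s \in A^{Ch}$ and $t \in T$, by

    \[ \hat{V}_{out}(a,s)(t) = V_{1,v}(t,a,s). \]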



Algorithm correctness. We now consider the correctness of the
convolver, specified by $\hat{V}_{out}$, with respect to the task specification $\Phi$. Since
the algorithm computes with respect to clock $T$ rather than $C$, on padded
input streams, and supplies valid output only at times $2n-1, 2n+1, \ldots$,
we need to schedule streams into and out from the algorithm.

We define an input scheduling function $\theta_{in} : [C \to A] \to [T \to A]$, for
all specification input streams $a : C \to A$ and network clock cycles $t \in T$,
by

    \[ \theta_{in}(a)(t) = \begin{cases}
         a(t/2) & \text{if } t \text{ even} \\
         0 & \text{if } t \text{ odd.}
       \end{cases} \]

We define an output scheduling function
$\theta_{out} : [T \to A] \to [C \to (A \cup \{unspec\})]$, for all network output streams
$a \in [T \to A]$ and specification times $c \in C$, by

    \[ \theta_{out}(a)(c) = \begin{cases}
         unspec & \text{if } c < n \\
         a(2c-1) & \text{if } c \geq n.
       \end{cases} \]

We consider the convolver to be correct with respect to $\Phi$, if for all
specification input streams $a \in [C \to A]$ and initial network states $s \in A^{Ch}$,
at each time $c \in C$,

    \[ \theta_{out}\bigl(\hat{V}_{out}(\theta_{in}(a), s)\bigr)(c) = \Phi(a)(c). \]
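The correctness condition can be exercised numerically on the running example. The Python sketch below is a test on a finite prefix, not a proof, and rests on several assumptions: the names `phi`, `theta_in`, `theta_out`, `v_out` and `correct` are hypothetical, the ring is the integers, the example stream is extended by zeros beyond the values quoted in the text, and a state is stored as a pair of lists holding the $u$ and $v$ channel values.

```python
UNSPEC = "unspec"


def phi(r, a, c):
    """Task specification Phi(a)(c)."""
    n = len(r)
    return UNSPEC if c < n else sum(r[k] * a(c - n + k) for k in range(n))


def theta_in(a):
    """Input scheduling: stream values at even cycles, padding 0 at odd ones."""
    return lambda t: a(t // 2) if t % 2 == 0 else 0


def theta_out(n, b):
    """Output scheduling: read the network output at time 2c - 1 once c >= n."""
    return lambda c: UNSPEC if c < n else b(2 * c - 1)


def v_out(r, a, s):
    """Network output stream t -> V_{1,v}(t, a, s), with s = (u values, v values)."""
    n = len(r)

    def at(t):
        u, v = list(s[0]), list(s[1])
        for k in range(t):
            ins = [a(k)] + u          # ins[j] is the first input of module j + 1
            v = [r[n - 1 - j] * ins[j] + (v[j + 1] if j + 1 < n else 0)
                 for j in range(n)]
            u = ins[:n - 1]
        return v[0]

    return at


def correct(r, a, s, horizon):
    """Check theta_out(V_out(theta_in(a), s))(c) == Phi(a)(c) for c < horizon."""
    out = theta_out(len(r), v_out(r, theta_in(a), s))
    return all(out(c) == phi(r, a, c) for c in range(horizon))


r = (1, 2, 3)
prefix = [2, 1, 3, 1, 3, 2, 3]
a = lambda c: prefix[c] if c < len(prefix) else 0     # zero extension is an assumption
states = [([0, 0], [0, 0, 0]), ([5, -1], [7, 2, 9])]  # zero and arbitrary initial states
results = [correct(r, a, s, 8) for s in states]
# results == [True, True]
```

The second initial state illustrates why the condition is stated for all $s$: by time $2n-1$ every initial channel value has been flushed from the array, so the scheduled outputs do not depend on it.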



See [HTT88] for a proof of correctness.

5.4 Example: Model of Cardiac Tissue

We now consider an example SCA from biology: a model of electrical 
behaviour in cardiac tissue. The SCA is derived from a biophysically 
detailed, high-order ODE model of a single guinea pig ventricular cell 
(see [Nob90]) that provides a quantitative description of the intracellu- 
lar processes and ionic concentrations and currents that determine the 
cell membrane potential. It has 17 dynamic variables, with kinetics de- 
rived from voltage clamp experiments. It is an example of a number of 
biophysical excitation equations for heart muscle [PH97]. 

Linking together copies of this model into a one-dimensional coupled 
ODE (CODE) lattice reconstructs a strand of tissue (where the coupling 
between ODEs represents the cell-to-cell junctional conductance). The 
CODE lattice itself is not an SCA, but many algorithmic approximations 
to the model are SCAs. As a full description of the biological details of the 
model itself would be too involved for our purposes, we will consider only 
the essential details of an SCA approximation, derived using the finite 
difference method. 

Consider the figure below, where each box represents a single module 
(cardiac cell) with nearest neighbour connections, a local source supplying 
a stream of electrical stimuli, and an observable output channel. The 
network computes on the set $\mathbb{R}$ of real numbers.

[Figure: a row of modules $1, 2, \ldots, n-2, n-1, n$, each with a local source and an output channel.]

Modules and Channels. We use the set I = {1, . . . , n} to index both the 
modules and the stimulation sources. Each module has 17 output chan- 
nels which hold its local state. Channel values include representations 
of membrane potential or voltage, and various ionic concentrations and 
processes. Details concerning these states are not relevant to the present 
discussion; we only need bear in mind that the model encompasses con- 
siderable biological detail. We will use the set 

    \[ Ch_i = \{i\} \times (\{v\} \cup dyn) \]

to index the channels of cell $i$ where $v$ denotes voltage and $dyn$ denotes
a set of 16 names used to identify each of the other dynamic states. The
sets $Ch_i$ determine a set $Ch$ of $17n$ network channels.

Of primary interest are the voltages of the cells across the network,
and thus we define the set $Out$ of observable or output channels to be

    \[ Out = \{(1,v), (2,v), \ldots, (n,v)\}. \]

Module functions. At each clock cycle, every cell computes on its 17- 
valued state, the voltages of its two nearest neighbours (boundary cells 
1 and n each have only one nearest neighbour), and on the current local 
stimulation value. The module functions, determined by a finite difference 
approximation technique, thus take the form 

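    \[ f_{i,w} : \mathbb{R}^{17} \times \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R} \qquad i = 2, \ldots, n-1 \]

    \[ f_{i,w} : \mathbb{R}^{17} \times \mathbb{R} \times \mathbb{R} \to \mathbb{R} \qquad i = 1, n. \]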

Architecture. Let $\alpha$ and $\beta$ be wiring maps that define the following
neighbourhoods in a straightforward way:

    \[ cnhd(i) = Ch_i \cup \{(i-1,v), (i+1,v)\} \qquad i = 2, \ldots, n-1 \]

    \[ cnhd(1) = Ch_1 \cup \{(2,v)\} \]

    \[ cnhd(n) = Ch_n \cup \{(n-1,v)\} \]

    \[ snhd(i) = \{i\} \qquad i = 1, \ldots, n \]

so that each module $i$ has feedback connections from all its output
channels ($Ch_i$), connections from its neighbours' voltage channels ($(i-1,v)$
and $(i+1,v)$, unless $i$ is a boundary module which has only one such
connection) and a connection from its local source $i$.

SCA equations. The formal specification of the model's components
determines channel state functions

    \[ V_{i,w} : T \times [T \to \mathbb{R}]^I \times \mathbb{R}^{Ch} \to \mathbb{R} \]

for each $(i,w) \in Ch$, defined for all stimulation streams $a \in [T \to \mathbb{R}]^I$
and initial states $s \in \mathbb{R}^{Ch}$, at time 0 by $V_{i,w}(0,a,s) = s(i,w)$ and at time
$t+1$ as follows:

    \[ V_{i,w}(t+1,a,s) = f_{i,w}(V_{i,v}(t,a,s), \ldots, V_{i-1,v}(t,a,s), V_{i+1,v}(t,a,s), a_i(t)) \]

for $i = 2, \ldots, n-1$, and

    \[ V_{1,w}(t+1,a,s) = f_{1,w}(V_{1,v}(t,a,s), \ldots, V_{2,v}(t,a,s), a_1(t)) \]

    \[ V_{n,w}(t+1,a,s) = f_{n,w}(V_{n,v}(t,a,s), \ldots, V_{n-1,v}(t,a,s), a_n(t)). \]

Observable behaviour. We define the observable (i.e., voltage) behaviour
of the CODE model as a function

    \[ V_{out} : T \times [T \to \mathbb{R}]^I \times \mathbb{R}^{Ch} \to \mathbb{R}^{Out} \]

for all $t \in T$, $a \in [T \to \mathbb{R}]^I$ and $s \in \mathbb{R}^{Ch}$, by

    \[ V_{out}(t,a,s) = (V_{1,v}(t,a,s), V_{2,v}(t,a,s), \ldots, V_{n,v}(t,a,s)). \]

To demonstrate the observable behaviour of the CODE model, consider
a system of $n = 2000$ coupled cells which, taking 80µm as the length
of a cardiac cell, represents a 160mm strand of tissue. This unreasonably
long strand is to allow for illustration of the fully spatially extended
travelling wave; in reality a wave propagates in a medium that is smaller than
its wavelength. Each clock cycle in the model represents 0.01ms of real
time, which gives a numerically stable solution to the CODE. Figure 5
illustrates snapshots of an action potential propagating along the system at
50ms (5000 clock cycle) intervals following a 30nA stimulation of the four
left-most cells for an initial period of 2ms (given by parameter streams
$a_i(t) = -30$ for $1 \leq i \leq 4$ and $t < 200$; $a_i(t) = 0$ otherwise), given that
the model begins in a uniformly resting state (given by an appropriate
value of $s$). Figure 5 illustrates only one aspect of the biological detail
reconstructed by the CODE model; we could easily trace the values of
each of the other 16 state coordinates at any cell.
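The figures quoted for this simulation are related by simple unit conversions, which the short Python sketch below spells out. It is only an arithmetic illustration: the constant and function names are hypothetical, and the module functions of the cell model itself are not reproduced.

```python
CELL_LENGTH_UM = 80     # length of one cardiac cell, in micrometres
N_CELLS = 2000          # number of coupled cells in the strand
DT_MS = 0.01            # real time represented by one clock cycle, in ms

strand_length_mm = N_CELLS * CELL_LENGTH_UM / 1000   # 160.0 mm of tissue
cycles_per_snapshot = round(50 / DT_MS)              # 5000 cycles per 50 ms
stimulus_cycles = round(2 / DT_MS)                   # 200 cycles for 2 ms


def stimulus(i, t):
    """Parameter stream a_i(t): -30 on cells 1..4 for t < 200, otherwise 0."""
    return -30.0 if 1 <= i <= 4 and t < stimulus_cycles else 0.0
```

For example, `stimulus(4, 199)` is the last non-zero stimulus value delivered to the fourth cell, and every cell with index above 4 receives the constant stream 0.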

6 Abstraction and Approximation Between SCAs 

Suppose we have two SCAs $N_1$ and $N_2$ with components $I_1$, $I_2$, $Ch_1$, $Ch_2$,
etc. We wish to compare the operation of the SCAs and define formally
the notion that the behaviour of $N_2$ (given by $V_2$) is an abstraction or
approximation of the behaviour of $N_1$ (given by $V_1$). In Section 6.1, we
show how components of $N_1$ can be related formally to those of $N_2$. Then,
in Section 6.2, we present a formal definition of behaviour abstraction for
SCAs. The component and behaviour abstractions and approximations
for SCAs are special cases of those for general spatially extended systems
given in Sections 4.1 and 4.2.

6.1 Comparing Components of SCAs 

We compare the finite spaces or module sets $I_1$ and $I_2$, the discrete clocks
$T_1$ and $T_2$, the sets $A_1^{Ch_1}$ and $A_2^{Ch_2}$ of global states, and the sets
$[T_1 \to A_1]^{In_1}$ and $[T_2 \to A_2]^{In_2}$ of input streams.

Fig. 5. An action potential travelling from left-to-right along the CODE model
following a single stimulus at the four left-most cells. Voltage is plotted against
space at 50ms intervals.



Spaces. We begin by comparing the sets $I_1$ and $I_2$ of modules. An SCA
respacing

    \[ \pi : I_1 \to I_2 \]

is a surjective function, with the intention that each $N_1$ module $i \in I_1$
is abstracted in $N_2$ by module $\pi(i) \in I_2$. The inverse
$\pi^{-1} : I_2 \to Powerset_f(I_1)$ of $\pi$ is defined by $\pi^{-1}(j) = \{i \in I_1 \mid \pi(i) = j\}$
where $\pi^{-1}(j)$ is the subspace of all $N_1$ modules abstracted in $N_2$ by module
$j \in I_2$. Figure 6 illustrates an SCA respacing $\pi$ and its inverse $\pi^{-1}$.



Clocks. Next consider the SCAs' global clocks $T_1$ and $T_2$. An SCA
retiming

    \[ \lambda : T_1 \to T_2 \]

is a surjective, monotonic function with the intention that each clock cycle
$t \in T_1$ is abstracted by clock cycle $\lambda(t) \in T_2$. From a retiming $\lambda$, we determine
an immersion $\bar{\lambda} : T_2 \to T_1$ defined, for all $t \in T_2$, by

    \[ \bar{\lambda}(t) = \min\{c \in T_1 \mid \lambda(c) = t\}. \]

Let the range of $\bar{\lambda}$ be denoted $Start_\lambda \subseteq T_1$; this set comprises clock cycles
of $T_1$ that correspond with the "beginning" of each cycle of clock $T_2$. The
inverse $\lambda^{-1} : T_2 \to Powerset_f(T_1)$ of $\lambda$ is defined, for all $t \in T_2$, by

    \[ \lambda^{-1}(t) = \{\bar{\lambda}(t), \bar{\lambda}(t)+1, \ldots, \bar{\lambda}(t+1)-1\}. \]

Fig. 6. An SCA respacing $\pi : I_1 \to I_2$ and its inverse $\pi^{-1} : I_2 \to Powerset_f(I_1)$.

Often, we are interested in expressing the idea that clock $T_1$ is $r$ times
faster than clock $T_2$ for some $r \in \mathbb{R}$ where $r > 1$. This is accomplished
by means of a linear retiming, which takes the form

    \[ \lambda(t) = \lfloor t/r \rfloor. \]

The linear retiming for the case $r = 2.5$ is illustrated in Figure 7.
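The retiming, its immersion and its inverse are easy to tabulate for a concrete rate. The Python sketch below does so for $r = 2.5$; it is an illustration under assumptions: the function names and the bounded linear search used for the minimum are choices made here, and exact rational arithmetic is used to avoid floating-point rounding in the floor.

```python
import math
from fractions import Fraction


def linear_retiming(r):
    """Return lambda(t) = floor(t / r) for a rate r > 1."""
    r = Fraction(r)
    return lambda t: math.floor(Fraction(t) / r)


def immersion(lam, t, search_limit=10_000):
    """Least clock cycle c of T_1 with lam(c) == t (bounded linear search)."""
    for c in range(search_limit):
        if lam(c) == t:
            return c
    raise ValueError("no preimage found below search_limit")


def inverse(lam, t):
    """All cycles of T_1 abstracted by cycle t of T_2."""
    return list(range(immersion(lam, t), immersion(lam, t + 1)))


lam = linear_retiming(Fraction(5, 2))            # r = 2.5
start = [immersion(lam, t) for t in range(4)]    # [0, 3, 5, 8]
blocks = [inverse(lam, t) for t in range(3)]     # [[0, 1, 2], [3, 4], [5, 6, 7]]
```

With $r = 2.5$ the blocks alternate between three and two cycles of $T_1$, so a non-integer rate is realised by cycles of $T_2$ that abstract unequal numbers of $T_1$ cycles.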

Global states. An SCA global state abstraction map is of the form

    \[ \phi : A_1^{Ch_1} \to A_2^{Ch_2} \]

with the intention that a global state $s \in A_1^{Ch_1}$ of SCA $N_1$ is abstracted
in SCA $N_2$ by $\phi(s) \in A_2^{Ch_2}$.

The map $\phi$ is spatially consistent with $\pi$ if, and only if, for all states
$s, s' \in A_1^{Ch_1}$ and channels $(j,u) \in Ch_2$,

    \[ s(i,v) = s'(i,v) \text{ for all } i \in \pi^{-1}(j),\ (i,v) \in Ch_i \implies \phi(s)(j,u) = \phi(s')(j,u). \]

The map $\phi$ is consistent with respect to observable states if, and only
if, for all states $s, s' \in A_1^{Ch_1}$ and output channels $(j,u) \in Out_2$,

    \[ s(i,v) = s'(i,v) \text{ for all } (i,v) \in Out_1 \implies \phi(s)(j,u) = \phi(s')(j,u). \]





220 



M.J. Poole, A.V. Holden, and J.V. Tucker 




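Fig. 7. The linear retiming $\lambda(t) = \lfloor t/2.5 \rfloor$. Here the immersion $\bar{\lambda} : T_2 \to T_1$ is
given by $\bar{\lambda}(0) = 0$, $\bar{\lambda}(1) = 3$, $\bar{\lambda}(2) = 5, \ldots$ and the set $Start_\lambda$ is defined by
$Start_\lambda = \{0, 3, 5, 8, \ldots\}$.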



Given a map $\phi$ that is consistent with respect to observable states, we
can derive an observable state abstraction map
$\phi_{obs} : A_1^{Out_1} \to A_2^{Out_2}$ as follows. For any state $s \in A_1^{Ch_1}$, let
$s_{obs} = s|_{Out_1} \in A_1^{Out_1}$ denote that part of $s$ that is observable or output.
Let $\phi_{obs}$ be defined, for all output channels $(j,u) \in Out_2$, by

    \[ \phi_{obs}(s_{obs})(j,u) = \phi(s)(j,u). \]



Input streams. We map input streams for SCA $N_1$ onto input streams
for $N_2$ by means of a stream abstraction function

    \[ \theta : [T_1 \to A_1]^{In_1} \to [T_2 \to A_2]^{In_2} \]

with the intention that streams $a \in [T_1 \to A_1]^{In_1}$ for SCA $N_1$ are
abstracted in SCA $N_2$ by $\theta(a) \in [T_2 \to A_2]^{In_2}$.

Let $\theta_j : [T_1 \to A_1]^{In_1} \to [T_2 \to A_2]$ be the $j$-coordinate of $\theta$, which
abstracts streams for the $N_2$ source $j \in In_2$. The map $\theta$ is temporally
consistent with the retiming $\lambda$ if, for all $a, a' \in [T_1 \to A_1]^{In_1}$, $t \in T_2$ and
$j \in In_2$,

    \[ a_i(c) = a'_i(c) \text{ for all } i \in In_1,\ c \in \lambda^{-1}(t) \implies \theta_j(a)(t) = \theta_j(a')(t). \]

It is often the case that each source of $N_1$ is abstracted by a single
source in $N_2$; such a relationship is expressed by a surjective source
abstraction map $\eta : In_1 \to In_2$ defined with the intention that data
supplied by a source $i \in In_1$ is abstracted by data supplied by source
$\eta(i) \in In_2$. The inverse $\eta^{-1} : In_2 \to Powerset_f(In_1)$ of $\eta$ is defined
by $\eta^{-1}(j) = \{i \in In_1 \mid \eta(i) = j\}$ where $\eta^{-1}(j)$ is the set of $N_1$ sources
abstracted in $N_2$ by source $j \in In_2$.

We say that $\theta$ is consistent with both $\lambda$ and $\eta$ if for all
$a, a' \in [T_1 \to A_1]^{In_1}$, $t \in T_2$ and $j \in In_2$,

    \[ a_i(c) = a'_i(c) \text{ for all } i \in \eta^{-1}(j),\ c \in \lambda^{-1}(t) \implies \theta_j(a)(t) = \theta_j(a')(t). \]

6.2 Notions of Abstraction and Approximation for SCAs 

For each of the notions of abstraction and approximation of behaviours 
for spatially extended systems given in Section 4.2, we can define special 
cases for SCAs. As an example, consider abstraction of global behaviour 
for SCAs. 

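Typically, we wish to express the notion that the state of $N_2$ at each
time $t \in T_2$ abstracts that of $N_1$ not at all clock cycles
$\bar{\lambda}(t), \bar{\lambda}(t)+1, \ldots, \bar{\lambda}(t+1)-1 \in T_1$ that $t$ abstracts, but only at the first
cycle $\bar{\lambda}(t)$. The set of all such clock cycles is given by $Start_\lambda$. The notion
of SCA abstraction now follows directly from Section 4.2: we say that the
global behaviour of $N_2$ is an abstraction of that of $N_1$ if the following
diagram commutes:

    \[ \begin{array}{ccc}
      T_2 \times [T_2 \to A_2]^{In_2} \times A_2^{Ch_2} & \xrightarrow{\;V_2\;} & A_2^{Ch_2} \\
      {\scriptstyle \lambda \times \theta \times \phi}\ \big\uparrow & & \big\uparrow\ {\scriptstyle \phi} \\
      Start_\lambda \times [T_1 \to A_1]^{In_1} \times A_1^{Ch_1} & \xrightarrow{\;V_1\;} & A_1^{Ch_1}
    \end{array} \]

that is, if $\phi(V_1(t,a,s)) = V_2(\lambda(t), \theta(a), \phi(s))$ for all $t \in Start_\lambda$,
$a \in [T_1 \to A_1]^{In_1}$ and $s \in A_1^{Ch_1}$.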

6.3 Example: Abstraction Between Systolic Convolvers 

In this section we consider a bit-level implementation of the systolic con- 
volver of Section 5.3, and define formally the notion that the behaviour 
of the first SCA abstracts that of the bit-level algorithm. 

A bit-level convolver. We begin the specification of the new SCA by
defining a bit-level implementation of the ring $A$ over which the convolver
of Section 5.3 computes. We shall assume that $A$ can be represented by
$k$-bit words for some $k > 0$. Specifically, we will assume the existence of an
algebra $(\{0,1\}^k \mid r_1^B, \ldots, r_n^B, \cdot^B, +^B)$ and an epimorphism $h : \{0,1\}^k \to A$
from this algebra to the algebra $(A \mid r_1, \ldots, r_n, \cdot, +)$ comprising the
reference word elements and operations $\cdot$ and $+$ used in the construction
of the abstract convolver in Section 5.3.

Fig. 8. A bit-level implementation of the convolver of Section 5.3.



Consider the SCA depicted in Figure 8 where each dotted box
$j = 1, \ldots, n$ comprises those modules that implement module $j$ of the first
convolver; each module has $k$ bit-valued output channels, and there are
$k$ sources (for clarity, we have used $k = 3$ in Figure 8). Each $d$ module
computes the identity function on $\{0,1\}^k$, and hence serves as a delay
module; each $p$ module computes $+^B$; and each module $m_j$ computes
$r_{n-j+1}^B \cdot^B a$, given input $a \in \{0,1\}^k$. (By the terminology of Section 5.1,
each module actually computes $k$ coordinate functions, one for each of its
output channels.)

The relationship to the first convolver is straightforward. Each
$k$-vector of channels that pass from box $j$ to $j+1$ represents channel $(j,u)$;
those that pass from dotted box $j$ to $j-1$ represent $(j,v)$; the $k$ output
channels represent $(1,v)$; and the $k$ sources represent the single source
$src$. Data entering a box at time $t$ determines data output from the box
at time $t+2$. Thus each clock cycle of the abstract convolver is represented
by two clock cycles: input stream and channel values of the abstract
convolver at times $0, 1, 2, \ldots, t, \ldots$ are represented on the appropriate sources
and channels here at times $0, 2, 4, \ldots, 2t, \ldots$.

To distinguish clearly between the two systems and their components,
let us denote the bit-level convolver by $N_1$ and the abstract convolver by
$N_2$. We rename the components $T$, $I$, $In$, $Ch$ and $Out$ of Section 5.3 by
$T_2$, $I_2$, $In_2$, $Ch_2$ and $Out_2$.

Using the sets

    \[ I_1 = \{d_{j,1}, d_{j,2}, d_{j,3}, p_j, m_j \mid j = 1, \ldots, n-1\} \cup \{d_n, m_n\} \]

    \[ In_1 = \{1, \ldots, k\} \]

    \[ Ch_1 = \{(i,1), \ldots, (i,k) \mid i \in I_1\} \]

    \[ Out_1 = \{(p_1,1), \ldots, (p_1,k)\} \]

to index the modules, sources, channels and output channels of the
bit-level convolver, we define channel state functions

    \[ V_{i,r} : T_1 \times [T_1 \to \{0,1\}]^k \times \{0,1\}^{Ch_1} \to \{0,1\} \]

for each channel $(i,r) \in Ch_1$ in terms of vectorised functions

    \[ V_i : T_1 \times [T_1 \to \{0,1\}]^k \times \{0,1\}^{Ch_1} \to \{0,1\}^k \]

for each module $i \in I_1$, where

    \[ V_i(t,a,s) = (V_{i,1}(t,a,s), \ldots, V_{i,k}(t,a,s)) \in \{0,1\}^k \]

denotes the $k$-bit word held on the output channels $(i,1), \ldots, (i,k)$ of
module $i$ at time $t$. For all $a \in [T_1 \to \{0,1\}]^k$ and $s \in \{0,1\}^{Ch_1}$, let

    \[ V_i(0,a,s) = (s(i,1), \ldots, s(i,k)) \qquad i \in I_1 \]

    \[ V_{d_{1,1}}(t+1,a,s) = (a_1(t), \ldots, a_k(t)) \]

    \[ V_{d_{j,1}}(t+1,a,s) = V_{d_{j-1,2}}(t,a,s) \qquad j = 2, \ldots, n-1 \]

    \[ V_{d_{j,2}}(t+1,a,s) = V_{d_{j,1}}(t,a,s) \qquad j = 1, \ldots, n-1 \]

    \[ V_{d_{j,3}}(t+1,a,s) = V_{p_{j+1}}(t,a,s) \qquad j = 1, \ldots, n-2 \]

    \[ V_{d_{n-1,3}}(t+1,a,s) = V_{d_n}(t,a,s) \]

    \[ V_{d_n}(t+1,a,s) = V_{m_n}(t,a,s) \]

    \[ V_{m_1}(t+1,a,s) = r_n^B \cdot^B (a_1(t), \ldots, a_k(t)) \]

    \[ V_{m_j}(t+1,a,s) = r_{n-j+1}^B \cdot^B V_{d_{j-1,2}}(t,a,s) \qquad j = 2, \ldots, n \]

    \[ V_{p_j}(t+1,a,s) = V_{m_j}(t,a,s) +^B V_{d_{j,3}}(t,a,s) \qquad j = 1, \ldots, n-1. \]

Component abstraction. To compare the behaviours of $N_1$ and $N_2$ we
first define mappings between their components.

Spaces. The respacing $\pi : I_1 \to I_2$, illustrated by the dotted boxes in
Figure 8, is defined by

    \[ \pi(d_{j,1}) = \pi(d_{j,2}) = \pi(d_{j,3}) = \pi(m_j) = \pi(p_j) = j \qquad 1 \leq j < n \]

    \[ \pi(d_n) = \pi(m_n) = n. \]

Clocks. Each clock cycle of the abstract convolver $N_2$ is represented by
two clock cycles of the bit-level algorithm $N_1$; we thus define a retiming
$\lambda : T_1 \to T_2$, for all $t \in T_1$, by $\lambda(t) = \lfloor t/2 \rfloor$. The corresponding immersion
$\bar{\lambda} : T_2 \to T_1$ is defined, for all $t \in T_2$, by $\bar{\lambda}(t) = 2t$, and $Start_\lambda$ is defined
by $Start_\lambda = \{0, 2, 4, \ldots\}$.

Data. Using the data abstraction map $h$, we define a space consistent
global state abstraction map $\phi : \{0,1\}^{Ch_1} \to A^{Ch_2}$, for all $N_1$ states
$s \in \{0,1\}^{Ch_1}$, as follows:

    \[ \phi(s)(j,u) = h(s(d_{j,2},1), \ldots, s(d_{j,2},k)) \qquad j < n \]

    \[ \phi(s)(j,v) = h(s(p_j,1), \ldots, s(p_j,k)) \qquad j < n \]

    \[ \phi(s)(n,v) = h(s(d_n,1), \ldots, s(d_n,k)). \]

Input streams. We define an input stream abstraction map
$\theta : [T_1 \to \{0,1\}]^k \to [T_2 \to A]$, for all $a \in [T_1 \to \{0,1\}]^k$ and $t \in T_2$, by

    \[ \theta(a)(t) = h(a_1(2t), \ldots, a_k(2t)). \]

We see that the data supplied by the $k$ sources of $N_1$ at even valued
clock cycles is abstracted using the data abstraction map $h$. This map is
consistent with the retiming $\lambda$ and the (trivial) source abstraction map
$\eta : In_1 \to In_2$ defined for all $i \in In_1$ by $\eta(i) = src$.

Abstraction of global behaviour. We now formalise the notion that
the global behaviour of the first convolver abstracts that of the bit-level
convolver $N_1$ defined above. Let $V_1$ and $V_2$ be the global state functions
determined from the channel state functions of these two SCAs. The
correctness of this SCA abstraction can be proven mathematically.

Theorem. The global operation of $N_2$ abstracts that of $N_1$ with respect
to the component abstraction maps defined above for any initial state
and input streams, at all clock cycles in $Start_\lambda = \{0, 2, 4, \ldots\}$; that is, the
following diagram commutes:

    \[ \begin{array}{ccc}
      T_2 \times [T_2 \to A] \times A^{Ch_2} & \xrightarrow{\;V_2\;} & A^{Ch_2} \\
      {\scriptstyle \lambda \times \theta \times \phi}\ \big\uparrow & & \big\uparrow\ {\scriptstyle \phi} \\
      Start_\lambda \times [T_1 \to \{0,1\}]^k \times \{0,1\}^{Ch_1} & \xrightarrow{\;V_1\;} & \{0,1\}^{Ch_1}
    \end{array} \]



6.4 Example: Approximation Between Models of Cardiac 
Tissue 

In this section we define a second model of the electrical activity of the
heart, and compare its behaviour with that of the CODE model of
Section 5.4. The PDE model we will define has been introduced in [AP96]
to reconstruct some basic properties of electrical wave propagation in cardiac
tissue, including action potential shape, dispersion and restitution
properties. Unlike the CODE model, the PDE does not model behaviour 
at the cellular level, although we might regard it as approximating bio- 
physical processes at a very abstract level. Compared with biophysically 
detailed models such as the CODE of Section 5.4, the computing re- 
sources required for simulating the behaviour of large volumes of muscle 
with simple models like this are modest, which is one reason for their pop- 
ularity. In fact, it will not be feasible to perform whole-heart simulations 
using cellular models for some time to come; recent computations using 
a supercomputer were limited to modelling the behaviour in a tiny two- 
dimensional portion of tissue [WKVN93]. Phenomenological PDE models 
such as the one defined here are commonly used for simulations of whole- 
heart behaviour [AP96,PH97]. 

In [HPT95,HPT98] the concepts of hierarchy for SCAs discussed in
this paper are extended to allow multi-level, hybrid models of cardiac ac- 
tivity to be defined, where different regions of tissue are modelled using 
different, interacting, SCAs. Modelling small regions of tissue with bio- 
physically derived, but computationally demanding CODEs embedded 
within a large-scale (e.g. whole-heart) phenomenological PDE system re- 
constructs both global wave behaviour (by the PDE), as well as local 
biophysics within the given regions (by the CODEs), and experimenta- 
tion with the model requires only modest computing power. 



A PDE model of wave propagation. The PDE model comprises two
equations:

$$\frac{\partial v}{\partial t} = -8v(v - 0.1)(v - 1) - vu + e + \frac{\partial^2 v}{\partial x^2}$$

$$\frac{\partial u}{\partial t} = \varepsilon(v,u)\,(-u - 8v(v - 1.1)).$$

The first equation approximates the tissue's fast excitation processes,
which we will identify with voltage. The second equation approximates
slow processes which we refer to as tissue "recovery". Here, $t$ and $x$ are
time and space variables, and $e$ is a time- and space-dependent variable
that models electrical stimuli. The map $\varepsilon(v,u) = 0.002 + 0.01u/(v + 0.14)$
represents the excitability of the tissue.

We define a finite difference approximation to this model as an SCA
that computes over the set $\mathbb{R}$, with numerical parameters $\Delta t$ (time step)
and $\Delta x$ (space step). To distinguish clearly between the CODE and PDE
models and their components, we denote the CODE SCA by $N_1$ and the
new PDE SCA by $N_2$, and rename the components $T$, $I$, $In$, $Ch$ and $Out$
of Section 5.4 by $T_1$, $I_1$, $In_1$, $Ch_1$ and $Out_1$, and the observable state
function $V_{out}$ by $V_{1,out}$. We will also assume that $N_1$ comprises a network
of $n = 4m$ modules.

The model computes on data from the set $\mathbb{R}$ with respect to a clock
$T_2$. Let $I_2 = \{1, \ldots, m\}$ index the modules and stimulation sources of the
PDE model $N_2$, and let $Ch_2 = \{(j,v), (j,u) \mid j \in I_2\}$ index its channels,
where $v$ denotes voltage and $u$ denotes recovery. Since we are interested
in observing the voltages of the system, let $Out_2 = \{(j,v) \mid j \in I_2\}$ be
the set of output channels.

The module functions are determined by the finite difference method
and the numerical parameters $\Delta t$ and $\Delta x$. Let each (non-boundary) cell
$j = 2, \ldots, m - 1$ compute

$$f_{j,v},\ f_{j,u} : \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}$$

for channels $(j,v)$ and $(j,u)$ respectively, defined, for all cell states
$(a,b) \in \mathbb{R}^2$, left and right neighbours' voltages $a_l, a_r \in \mathbb{R}$, and input stimuli
$e \in \mathbb{R}$, by

$$f_{j,v}(a, b, a_l, a_r, e) = a + \Delta t\,(-8a(a - 0.1)(a - 1) - ab + e) + \frac{\Delta t\,(a_l - 2a + a_r)}{\Delta x^2}$$

$$f_{j,u}(a, b, a_l, a_r, e) = b + \Delta t\,(\varepsilon(a,b)(-b - 8a(a - 1.1))).$$

(The left- and rightmost cells $j = 1, m$ share module functions
$f_{j,v}, f_{j,u} : \mathbb{R}^2 \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ defined using equations similar to the above, but
dependent on a single nearest neighbour's voltage.)

For each $(j,w) \in Ch_2$ we define a channel state function

$$V_{j,w} : T_2 \times [T_2 \to \mathbb{R}]^{I_2} \times \mathbb{R}^{Ch_2} \to \mathbb{R}$$

for all stimulation streams $a \in [T_2 \to \mathbb{R}]^{I_2}$ and initial states $s \in \mathbb{R}^{Ch_2}$, at
time 0 by $V_{j,w}(0, a, s) = s(j,w)$, and at time $t + 1$ for $j = 2, \ldots, m - 1$, by

$$V_{j,w}(t+1, a, s) = f_{j,w}(V_{j,v}(t,a,s),\ V_{j,u}(t,a,s),\ V_{j-1,v}(t,a,s),\ V_{j+1,v}(t,a,s),\ a_j(t))$$

and

$$V_{1,w}(t+1, a, s) = f_{1,w}(V_{1,v}(t,a,s),\ V_{1,u}(t,a,s),\ V_{2,v}(t,a,s),\ a_1(t))$$

$$V_{m,w}(t+1, a, s) = f_{m,w}(V_{m,v}(t,a,s),\ V_{m,u}(t,a,s),\ V_{m-1,v}(t,a,s),\ a_m(t)).$$

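Observable behaviour. We define the observable behaviour

$$V_{2,out} : T_2 \times [T_2 \to \mathbb{R}]^{I_2} \times \mathbb{R}^{Ch_2} \to \mathbb{R}^{Out_2}$$

of the PDE, for all $t \in T_2$, $a \in [T_2 \to \mathbb{R}]^{I_2}$ and $s \in \mathbb{R}^{Ch_2}$, by

$$V_{2,out}(t, a, s) = (V_{1,v}(t,a,s),\ V_{2,v}(t,a,s),\ \ldots,\ V_{m,v}(t,a,s)).$$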

We illustrate the model's observable behaviour using a simulation similar
to that of the CODE model in Figure 5. Consider a system comprising
$m = 500$ cells representing a 160mm strand of cardiac tissue as for the
CODE model, thus assuming a PDE cell represents a 0.32mm sequence
of 4 cardiac (and CODE) cells. Numerical parameter values $\Delta t = 0.112$
and $\Delta x = 1.57$ give a stable solution to the PDE and a good approximation
to the action potential duration, restitution and wave propagation
velocity of the CODE model, taking each clock cycle to represent 0.05ms
or 5 CODE clock cycles. Figure 9 shows the voltages along the space at
50ms (1000 clock cycle) intervals following an initial stimulus of value 0.2
at the left-most cell for a period of 2ms or 40 clock cycles (achieved by
taking $a_1(t) = 0.2$ for $t < 40$ and $a_j(t) = 0$ for all other $j$ and $t$) for an
initially resting system ($s(j,w) = 0$ for all $(j,w) \in Ch_2$).
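As an illustration of how the module functions and the stimulus protocol fit together, the following Python sketch advances all cells by one clock cycle and runs the stimulus described above. It is a minimal sketch, not the authors' simulator: the function names are hypothetical, the diffusion term is assumed to be scaled by the time step as in a standard explicit scheme, and the boundary rule (a cell at either end treats its missing neighbour as a copy of itself) is an assumption, since the text only says that boundary cells depend on a single neighbour.

```python
def excitability(v, u):
    """The map epsilon(v, u) of the PDE model."""
    return 0.002 + 0.01 * u / (v + 0.14)

def step(v, u, stim, dt=0.112, dx=1.57):
    """Advance every cell by one clock cycle using explicit finite differences."""
    m = len(v)
    new_v, new_u = [0.0] * m, [0.0] * m
    for j in range(m):
        a, b = v[j], u[j]
        # ASSUMPTION: a boundary cell mirrors itself in place of the missing neighbour.
        left = v[j - 1] if j > 0 else a
        right = v[j + 1] if j < m - 1 else a
        reaction = -8 * a * (a - 0.1) * (a - 1) - a * b + stim[j]
        diffusion = (left - 2 * a + right) / dx ** 2
        new_v[j] = a + dt * (reaction + diffusion)
        new_u[j] = b + dt * excitability(a, b) * (-b - 8 * a * (a - 1.1))
    return new_v, new_u

def simulate(m, cycles, stim_cycles=40, stim_value=0.2):
    """Run from rest, stimulating only the left-most cell for stim_cycles cycles."""
    v, u = [0.0] * m, [0.0] * m
    for t in range(cycles):
        stim = [0.0] * m
        if t < stim_cycles:
            stim[0] = stim_value
        v, u = step(v, u, stim)
    return v
```

Calling `simulate(500, 1000)` would correspond to the first 50ms of the run described in the text; whether the resulting profile matches Figure 9 depends on the assumptions listed above.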



Component abstraction. We compare the components of the PDE
model $N_2$ with those of the CODE model $N_1$ in order to compare their
behaviours.

Spaces. Each PDE cell represents four neighbouring CODE cells; this
fact is formalised by a respacing $\pi : I_1 \to I_2$ defined, for all CODE cells
$j \in I_1$, by $\pi(j) = \lceil j/4 \rceil$. Thus each PDE cell $j \in I_2$ abstracts CODE cells
$\pi^{-1}(j) = \{4j - 3, 4j - 2, 4j - 1, 4j\}$.

Fig. 9. An action potential propagating from left-to-right along the PDE model
following a single stimulus at the left-most cell. Voltage is plotted against space
at 50ms intervals.



Clocks. We define a linear retiming $\lambda : T_1 \to T_2$ by $\lambda(t) = \lfloor t/5 \rfloor$ with the
intention that the CODE clock $T_1$ is five times faster than the PDE clock
$T_2$.

Data. We now define the state abstraction map $\phi : \mathbb{R}^{Ch_1} \to [Ch_2 \rightharpoonup \mathbb{R}]$
relating global states of the CODE model $N_1$ with those of the PDE
model $N_2$. Notice that $\phi$ maps onto partial PDE states since we cannot
compare fully the global states of the two models.

Values on each PDE voltage channel $(j,v)$ abstract values on the voltage
channels of the set $\pi^{-1}(j) = \{4j - 3, 4j - 2, 4j - 1, 4j\}$ of four CODE
cells abstracted by $j$. We use a simple averaging/scaling technique to
map voltages, determined from the minimum and maximum voltage values
that ordinarily occur in each of the models. For the CODE model,
a propagating action potential generates minimum and maximum voltage
values of -94.25 and 48.25 (mV) respectively; in the PDE model the
normal minimum and maximum values are 0 and 1. (In some circumstances,
voltage values can fall outside these limits, by small amounts, in
both models.) Using these values we define $\phi$, for all global CODE states
$s \in \mathbb{R}^{Ch_1}$ and PDE voltage channels $(j,v) \in Ch_2$, by

$$\phi(s)(j,v) = (average\{s(i,v) \mid i \in \pi^{-1}(j)\} + 94.25)/142.5$$
$$= (average\{s(4j - 3,v), \ldots, s(4j,v)\} + 94.25)/142.5.$$
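The respacing and the voltage part of the state abstraction can be written down directly. The sketch below uses hypothetical names (`respace`, `preimage`, `phi_voltage`) and assumes a CODE voltage state is given as a dictionary from 1-based cell index to millivolts.

```python
V_MIN, V_MAX = -94.25, 48.25  # ordinary CODE voltage range in mV, from the text

def respace(i: int) -> int:
    """Respacing: CODE cell i (1-based) is abstracted by PDE cell ceil(i / 4)."""
    return -(-i // 4)

def preimage(j: int) -> list:
    """The four CODE cells abstracted by PDE cell j."""
    return [4 * j - 3, 4 * j - 2, 4 * j - 1, 4 * j]

def phi_voltage(code_voltage: dict, j: int) -> float:
    """Average the four CODE voltages and rescale them to the PDE range [0, 1]."""
    cells = preimage(j)
    mean = sum(code_voltage[i] for i in cells) / len(cells)
    return (mean - V_MIN) / (V_MAX - V_MIN)
```

The divisor `V_MAX - V_MIN` equals 142.5, so a fully depolarised group of cells maps to 1 and a group at the minimum voltage maps to 0.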




Hierarchies of Spatially Extended Systems and Synchronous Concurrent Algorithms 229 



The second equation in the PDE model $N_2$ represents slow processes,
which include everything in the CODE model with the exception of
voltage. Due to the biological detail of the CODE model $N_1$, expressed
largely by the computation of each CODE cell's 16 local non-voltage
states, it is pointless to attempt to formulate a complete abstraction
map. Almost always, values on each PDE recovery channel $(j,u)$ will not
approximate the state of the set $\pi^{-1}(j) \times dyn$ of 64 channels, even for
fairly weak notions of approximation. By Section 6.1 however, there must
exist at least one CODE state $s \in \mathbb{R}^{Ch_1}$ for which $\phi(s) \in \mathbb{R}^{Ch_2}$ is totally
defined. The most useful state for comparing the two models' behaviours
is one corresponding to uniformly resting (or completely recovered) tissue
(such a state was used as the initial state in the simulation illustrated in
Figure 5). Let us define a state $s \in \mathbb{R}^{\pi^{-1}(j) \times dyn}$ of the subspace $\pi^{-1}(j)$ to
be at rest with respect to non-voltage channels if, for all $i \in \pi^{-1}(j)$ and
$w \in dyn$, $s(i,w) = rest_w$, where $rest_w \in \mathbb{R}$ is termed the resting value
for $w$-channels. Resting tissue in the PDE model $N_2$ is represented by
the value 0. We therefore define $\phi$, for all $N_1$ states $s \in \mathbb{R}^{Ch_1}$ and $N_2$
recovery channels $(j,u) \in Ch_2$, by

$$\phi(s)(j,u) = \begin{cases} 0 & \text{if } s(i,w) = rest_w \text{ for all } i \in \pi^{-1}(j) \text{ and } w \in dyn \\ \uparrow & \text{otherwise.} \end{cases}$$

The map $\phi$ is obviously space consistent, and since its observable (voltage)
part $\phi_{obs} : \mathbb{R}^{Out_1} \to \mathbb{R}^{Out_2}$ maps only onto totally defined observable
states of $N_2$, we can define a notion of observable approximation between
the two models.

Input streams. Finally, we define a function $\theta : [T_1 \to \mathbb{R}]^{I_1} \to [T_2 \to \mathbb{R}]^{I_2}$
that compares global streams of stimuli between the two models.
We define $\theta$ to be consistent with $\lambda$, and with $\pi$ viewed as a source
abstraction map: for each coordinate $\theta_j : [T_1 \to \mathbb{R}]^{I_1} \to [T_2 \to \mathbb{R}]$ of $\theta$
(for $j \in I_2$) the value $\theta_j(a)(t) \in \mathbb{R}$ abstracted for source $j$ at time $t \in T_2$
from a set $a \in [T_1 \to \mathbb{R}]^{I_1}$ of CODE input streams should depend only
on the 20 values supplied at times $\lambda^{-1}(t) = \{5t, \ldots, 5t + 4\}$ by the set
$\pi^{-1}(j) = \{4j - 3, \ldots, 4j\}$ of sources.

It is difficult to meaningfully define each 0j as a total function, and not 
really necessary to compare the models: stimuli streams are a somewhat 
artificial mechanism for initiating action potentials in models of tissue not 
integrated into the whole-heart (in the heart, a special region of tissue, the 
sino-atrial node, generates action potentials which propagate throughout 
the muscle). We define the map $\theta$ to be partial with non-empty domain
$dom(\theta)$. The map $\theta$ is defined where, for all $j \in I_2$ and $t \in T_2$, the above
20 values are (i) all 0, representing an absence of stimulation; or (ii) all
-30, representing a uniform stimulation of 30nA suitable for generating
(at least for resting tissue) an action potential. Corresponding suitable
values in the PDE model $N_2$ are 0 and 0.2. The domain of $\theta$ is given by

$$dom(\theta) = \{a \in [T_1 \to \mathbb{R}]^{I_1} \mid (\forall j \in I_2)(\forall t \in T_2)(\forall i \in \pi^{-1}(j))(\forall c \in \lambda^{-1}(t))\ [a_i(c) = 0 \text{ or } a_i(c) = -30]\}$$

and we define each $\theta_j$, for all $a \in dom(\theta)$ and $t \in T_2$, by

$$\theta_j(a)(t) = \begin{cases} 0 & \text{if } a_i(c) = 0 \text{ for all } i \in \pi^{-1}(j) \text{ and } c \in \lambda^{-1}(t) \\ 0.2 & \text{if } a_i(c) = -30 \text{ for all } i \in \pi^{-1}(j) \text{ and } c \in \lambda^{-1}(t). \end{cases}$$

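The partial stream abstraction can be sketched as a function that returns `None` where it is undefined. The name `theta_j` and the representation of CODE input streams as a dictionary from 1-based source index to a function of time are assumptions made for this illustration.

```python
from typing import Callable, Dict, Optional

def theta_j(a: Dict[int, Callable[[int], float]], j: int, t: int) -> Optional[float]:
    """Abstract the 20 CODE stimulus values behind PDE source j at PDE time t.

    Returns 0.0 when all 20 values are 0, 0.2 when all are -30, and None
    (undefined) for any other combination.
    """
    sources = range(4 * j - 3, 4 * j + 1)   # the four CODE sources behind PDE source j
    times = range(5 * t, 5 * t + 5)         # the five CODE cycles behind PDE cycle t
    values = [a[i](c) for i in sources for c in times]
    if all(x == 0 for x in values):
        return 0.0
    if all(x == -30 for x in values):
        return 0.2
    return None
```

A stream set lies in the domain of the abstraction exactly when `theta_j` returns a number for every source and every time of interest.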
Approximation of observable behaviour. We now define one notion
in which the observable behaviour of the PDE model $N_2$ can be said to
approximate that of the CODE model $N_1$.

Let $d : \mathbb{R}^{Out_2} \times \mathbb{R}^{Out_2} \to \mathbb{R}^+$ be a metric that compares two global
observable states of the PDE model, and is defined, for all $s_1, s_2 \in \mathbb{R}^{Out_2}$,
by

$$d(s_1, s_2) = \sum_{(j,v) \in Out_2} |s_1(j,v) - s_2(j,v)|$$

such that $d(s_1, s_2)$ is the sum of the absolute differences of cell voltages
across the space.

Due to the biological complexity of the CODE model, there are some 
streams, initial states and times for which its observable behaviour will 
not be approximated by the PDE model, even for reasonably large tol- 
erances. For example, the PDE model does not accurately reproduce the 
vulnerability properties of the CODE model, which deal with the effects 
of stimulating tissue a short distance behind a travelling action poten- 
tial. Depending on the strength and position of the stimulus, either (i) 
no new action potential is initiated, (ii) a single action potential is gen- 
erated which travels in the opposite direction from the original wave; or 
(iii) two action potentials are generated, one travelling in either direction. 
An input stream that gives rise to one of these cases in the CODE model 
may (when abstracted) result in a different case for the PDE model. If so, 
after a short time period, the action potential positions will not match 
between the two models, giving large values of d. 

However, for many cases the observable behaviour of the PDE model 
will approximate that of the CODE model. A simple example is given by
the values used in the simulations of Figures 5 and 9. This does not yield
a mathematically provable theorem, but we do obtain, from experimen- 
tation, a useful result. 

Experimental result. There exist subsets $sub(T_1) \subseteq Start_\lambda$,
$sub([T_1 \to \mathbb{R}]^{I_1}) \subseteq [T_1 \to \mathbb{R}]^{I_1}$ and $sub(\mathbb{R}^{Ch_1}) \subseteq dom_p(\phi)$ useful for practical
modelling purposes, such that for all clock cycles $t \in sub(T_1)$, stream sets
$a \in sub([T_1 \to \mathbb{R}]^{I_1})$ and initial states $s \in sub(\mathbb{R}^{Ch_1})$,

$$d(\phi_{obs}(V_{1,out}(t, a, s)),\ V_{2,out}(\lambda(t), \theta(a), \phi(s))) < \varepsilon$$

for an acceptably small value of $\varepsilon \in \mathbb{R}^+$. A fact important for practical
purposes is that when all the subsets are finite, approximation can be
exhaustively tested for.
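When the three subsets are finite, the exhaustive test mentioned above amounts to a loop over their product. The following sketch is generic: `approximates` and all of its parameters are hypothetical names, observable states are assumed to be dictionaries from channel to voltage, and the two models are passed in as plain functions.

```python
from itertools import product

def d(s1: dict, s2: dict) -> float:
    """Sum of absolute voltage differences over the observable channels of s1."""
    return sum(abs(s1[ch] - s2[ch]) for ch in s1)

def approximates(v1_out, v2_out, phi_obs, lam, theta, phi,
                 times, streams, states, eps):
    """Check the approximation bound for every (time, stream set, initial state)."""
    return all(
        d(phi_obs(v1_out(t, a, s)), v2_out(lam(t), theta(a), phi(s))) < eps
        for t, a, s in product(times, streams, states)
    )
```

The cost is the product of the three subset sizes times the cost of running both models, which is why the result is stated only for subsets that are useful in practice.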

7 Concluding Remarks 

Our conception of a theory of hardware is very broad. In particular, it 
includes within its scope applications to the analysis of digital hardware, 
and of physical and biological systems. For example, we would wish to 
include in the desiderata for a theory the capability of integrating digital 
and biological systems. This integration is needed in applications where 
an implantable cardiac device is coupled to cardiac tissue, forming a single 
system. More ambitiously perhaps, we would wish the theory to account 
for similarities and differences between hardware, firmware and wetware. 
We are far from such a comprehensive theory at present. 

We posed, in the Introduction, the Integrative Hierarchy Problem for 
a theory of hardware which requires a mathematical theory that can relate 
and integrate a range of mathematical models of systems at different levels 
of abstraction. This means that we need formal concepts that embrace 
models of hardware, physical and biological systems and that explain how 
to compare, couple, and partially substitute such models. 

The aim of this paper is to pose this general hierarchy problem and to 
report on some initial progress of our theoretical analysis of its solution. 

In the paper we have introduced a simple definition of spatially ex- 
tended system which clearly embraces a broad range of computing, phys- 
ical and biological models. We have used the general theory of data types 
to show that computable systems, and computable systems that approx- 
imate continuous systems, can be defined very simply by small finite sets 
of equations. Thus we have established a prima facie case that the algebraic
theory of data and equational specifications can successfully attack
the general problem. We also gave some simple definitions of hierarchical
structure between two SESs.

In the second half of the paper we studied synchronous concurrent 
algorithms (SCAs). We showed that SCAs are indeed SESs, and that 
they model both computer hardware and biological systems. We applied 
the hierarchical notions about SESs to derive appropriate hierarchical 
notions for SCAs. These notions were then used to hierarchically analyse 
algorithms for signal processing and cardiac tissue. 

Our intention is to make a conceptual contribution to the theory of 
hardware. Within the confines of our algebraic approach, we have concen- 
trated on the narrow task of exploring the essential scientific structure of 
the Integrative Hierarchy Problem, while emphasising the full generality 
of its application. This theoretical analysis is far from over, of course. For 
example, there are many problems hidden in case studies, such as the hi- 
erarchical relationship between discrete digital and continuous electrical 
models of chips, and the coupling of digital and biological systems. 

Outside the algebraic approach, there are questions concerning the 
relationship between our models and the various theories of hybrid control 
systems (see the various approaches in [GNRR93], for example). This 
field of applications is blessed with a landscape of models and logics for 
reasoning about the behaviour and control of timed state systems, timed 
automata, and abstract machines. 

Furthermore, work is needed on questions concerning the hierarchy, 
including the refinement, parameterisation, modularity and scalability of 
our models. These are basic concerns underlying hardware description 
languages, of course. 

On a practical front the SCA approach has been used in the modelling
and specification of examples of increasing size in both hardware
and biological systems. For example, the methods for the study of
microprocessors, started in [HT93], have been applied to complex examples
in [Fox98] and [FH99], and to the Java Virtual Machine in [Ste99]. The
methods for the study of biological excitable media have been used in our
programme on heart simulation [HPT95,HPT96a].

Abstract. We develop an algebraic specification of the architecture of
an abstract and simplified version of the Java Virtual Machine (JVM). 
This concentration on the implementation-independent features of the 
machine allows us to build a clean and easily comprehensible model in 
which its structure is emphasised. We then axiomatise the semantics of 
programs operating on this architecture. We also consider how we can 
concretise this abstract model which provides us with a firm foundation 
for exploring the entire JVM and thus of analysing the correctness of 
Java implementations. 



1 Introduction 

Virtual machines are software emulations of physical, or physically real- 
isable, machines; they act as “synthetic computers” (Liu [1996]). Virtual 
machines are used to describe and standardise the behaviour of a variety 
of applications across a range of platforms and so must abstract from
architecture-dependent details.

For example, the operational semantics of programming languages, 
from the SECD machine (Landin [1964]) through to the abstract rewrit- 
ing machine (Kamperman and Walters [1993]), explain the behaviour of 
programs in terms of their effect on abstract or virtual machines. This 
idea has been extended and applied to the implementation of a number of 
languages, for example, the Warren Abstract Machine for Prolog (Warren 
[1983]), the Java Virtual Machine for Java (Gosling [1995], Lindholm and 
Yellin [1997]). 

Related to this implementation theme, the universal intermediate lan- 
guage UNCOL (Strong et al. [1958]) was envisaged as a general intermedi- 
ate language for abstract machines. This goal has been realised for partic- 
ular compiler front-ends, with, for example, P-Code for Pascal (described 
in Nori et al. [1981]) and the Register Transfer Language (Davidson and 
Fraser [1984]) which has been used for a number of languages. 



B. Möller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 236-277, 1998.
© Springer-Verlag Berlin Heidelberg 1998




Towards an Algebraic Specification of the Java Virtual Machine 






Other application areas revolve around that of operating systems. The 
IBM VM operating system series (for example, the VM/370 (Seawright 
and MacKinnen [1979])) and Microsoft Windows 95 (King [1994]) both 
implement virtual machines to facilitate compatibility between product 
versions. Virtual machines are now also being used to enable executable 
code to be emulated on different platforms, for example, MS-DOS based 
products running on INTEL processors. 

Recently, there has been renewed interest in virtual machines through 
their use in implementing the Java programming language: it is the abil- 
ity of Java applets to deliver code across the internet which will execute 
on different platforms, that has driven this interest in employing an ar- 
chitecturally neutral model of execution in the form of the Java Virtual 
Machine (JVM). We take the JVM as a case study for the application of 
general techniques to the semantic modelling of deterministic machines. 
We show that these methods apply smoothly to virtual machines, help- 
ing to close the gap between semantic models of programs, systems and 
hardware. 

Specifically, in this paper, we take the methods of algebraically mod- 
elling microprocessors described in Harman and Tucker [1996, 1997] and 
Fox and Harman [1998], and 

(i) axiomatise the modelling process to yield an algebraic specification 
framework in Section 2 for defining the semantics of machines; 

(ii) apply these techniques to the algebraic specification of the architecture
(Section 3) and semantics (Section 4) of an abstract (and simplified)
version of the JVM; and

(iii) explain how we can concretise our abstract JVM model to provide a
specification of the JVM in Section 5.

We axiomatise the semantics of the JVM by describing how the system 
evolves over time. We iterate a next-state function on a specification of an 
algebraic model of the JVM and enumerate with a clock the sequence of 
states produced. We specify this iteration by means of primitive recursive 
equations. 

One of our major concerns is how we can manage the scale of this large 
example (the concrete JVM has 201 instructions). We first produce a more 
abstract model of the JVM in Section 3, by removing implementation de- 
pendent features. Then we build a specification of the architecture of this 
abstract JVM by linking together and instantiating generic specifications 
of abstract data types that describe commonly occurring structures. 

These abstractions percolate through to the instruction set, reducing
its size to 20 instructions (or rather families of instructions which are
indexed by the sort set of the underlying abstract data type that we compute
over). We define the next-state function on each of these instructions
in Section 4, to produce an axiomatisation of the semantics of the JVM.

Our final task in Section 5 is to relate our abstract model to that of 
the JVM. 

One aim of this work is to continue this modelling process upwards 
(to Java) and downwards (towards a particular platform). Such models 
can be integrated within a common framework testing for trusted compi- 
lation. This would allow us to trace the progress of the execution of Java 
programs from their conception in software to their implementation in 
hardware. 

The reader is assumed to have some familiarity with algebraic specifi- 
cations (Meinke and Tucker [1992]) and the Java Virtual Machine (Lind- 
holm and Yellin [1997]). 

Related Work There have been a number of approaches to specifying 
the JVM. 

In Börger and Schulte [1998], the JVM and its instructions are subdivided
into incremental sets, and their semantics are modelled using
abstract state machines. In addition, compilers from subsets of Java to 
these languages are constructed. 

Hartel et al. [1998] produce an executable specification of the seman- 
tics of the Java Secure Processor; this is essentially a modified subset of 
the JVM designed to be sufficiently small to fit onto a smart card. 

Pusch [1998] also describes an executable specification of the JVM.
An abstract model of the JVM is produced, although she does not exploit
the abstractions to as full an extent as we do. A principal feature of our
work is handling the issue of scale; we employ the principles of abstraction
and modularisation wherever possible.

Some of the literature on the semantics of the JVM is more biased 
towards type-checking. For example, Qian [1998] is concerned with pro- 
ducing a static type inference system, and Cohen [1997] uses run-time 
checks to ensure type-correctness. Much work has focused on typing con- 
straints; see for example, Goldberg [1998], Freund and Mitchell [1998] and 
Stata and Abadi [1998]. This is an area which we do not concentrate on 
(to enable the dynamic semantics to be viewed with greater clarity); we 
presume that the instructions have already been type-checked. However, 
a thorough investigation could be made using the work in this paper as 
a foundation.

2 Modelling and Specification Preliminaries

In this section we describe the algebraic specification framework that we 
use to define the semantics of machines. 

2.1 Machine Semantics 

We shall define the behaviour of a machine in terms of how the system
evolves from one state to another over time. Thus, we consider the
execution of a machine from an initial state $\tau_0$ to produce a finite

$$\tau_0, \tau_1, \ldots, \tau_t$$

or infinite sequence

$$\tau_0, \tau_1, \ldots$$

of machine states. We shall encode finite sequences $\tau_0, \tau_1, \ldots, \tau_t$ as infinite
ones $\tau_0, \tau_1, \ldots, \tau_t, *, *, \ldots$, where $*$ is a distinguished machine state
specifically introduced for this purpose.



Time We use a clock Time to enumerate our state sequences. As we
use a discrete clock to record events, we can specify the generation of the
ticks of the clock with:

    specification  TIME
    import
    sorts          time
    constants      Zero : → time
    operations     Succ : time → time
    equations


Next-State Function To define the behaviour of machines, we introduce
a function

    Next : machine_state → machine_state

such that $Next(\tau_t)$ gives the next state $\tau_{t+1}$ that results from executing
the machine on the state $\tau_t$.

Thus, we can define machine semantics using the iterated map Next*, 
which we specify by: 





240 K. Stephenson 



    specification  MACHINE_SEMANTICS
    import         MACHINE_STATE, TIME
    sorts
    constants
    operations     Next : machine_state → machine_state
                   Sem : machine_state × time → machine_state
    equations      Sem(τ, Zero) = τ
                   Sem(τ, Succ(t)) = Sem(Next(τ), t)
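The two equations for Sem unfold into a simple loop: apply the next-state function once per clock tick. The sketch below uses hypothetical names and a toy counting machine; the convention that the halted state `HALT` (standing in for the distinguished state `*`) is a fixed point of the next-state function is an assumption made for the example.

```python
HALT = "*"  # stands in for the distinguished machine state *

def sem(next_state, tau, t: int):
    """Iterate next_state t times from tau, as in Sem(tau, Succ(t)) = Sem(Next(tau), t)."""
    for _ in range(t):
        tau = next_state(tau)
    return tau

def toy_next(tau):
    """Toy machine: counts upwards and halts on reaching 3 (ASSUMPTION: HALT is a fixed point)."""
    if tau == HALT or tau >= 3:
        return HALT
    return tau + 1
```

Because `HALT` maps to itself, a finite run is padded with copies of the distinguished state, matching the encoding of finite state sequences as infinite ones.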



2.2 Machine States 

Thus far in our model of semantics, we have just assumed that we have
some specification MACHINE_STATE of the set of states of the machine
which includes the distinguished state $*$. We now extend the modelling
process as far as we can whilst striving to maintain generality.

Programmable Machine States We want to consider how a machine 
behaves in response to the execution of programs. 

First, we observe that the distinction between programs and states is 
somewhat blurred; the program is stored within the state, and certain as- 
pects of the state are typically determined by the program to be executed. 
However, it is useful to be able to separate out these concerns as distinct 
entities so that we do not consider the program as being hardwired into 
the state. 

Let PROG be a specification of the set of programs for the machine 
(for example, as described later in this section) and STATE that of the set 
of all states of the machine without the program component (for example, 
the von Neumann general architecture specification given in Section 2.3). 
We suppose that we have a function 

    Install : prog × state → machine_state

such that $Install(P, \sigma)$ gives the state of the machine which results from
loading the program $P$ into memory for execution on the state whose
initial values are determined by $\sigma$.

We can now specify the concept of a programmable machine state:

    specification  MACHINE_STATE
    import         PROG, STATE
    sorts          machine_state
    constants      * : → machine_state
    operations     Install : prog × state → machine_state
                   π_prog : machine_state → prog
    equations      π_prog(Install(P, σ)) = P



Note that projecting out the state component of a machine state
$Install(P, \sigma)$ will not necessarily yield $\sigma$. (The construction of machine
states is not, in general, that of forming a Cartesian product, which we
specify in Section 2.3.)

General Architecture Model In order to add further structure to our 
model, we have to make certain assumptions about the architecture of 
the machines that we want to perform this process for. We shall take the 
class of von Neumann machines, whose architecture follows the classical 
structure illustrated in Figure 1, (although note that we consider the 
program to be a separate entity from that of the memory). 




Fig. 1. Von Neumann architecture. The dotted arrows indicate the flow of data between 
components. 



We produce an algebra State below that models the architecture of 
von Neumann machines; we will give a succinct specification STATE of 
this model in Section 2.3 when we have introduced appropriate generic 
specifications that make this task more manageable.

    algebra      State
    import       Memory, CU, ALU
    carriers     State = Memory × CU × ALU
    constants
    operations   δ^Memory : Memory × State → State
                 δ^CU : CU × State → State
                 δ^ALU : ALU × State → State
    definitions  δ^Memory(M', (M, C, A)) = (M', C, A)
                 δ^CU(C', (M, C, A)) = (M, C', A)
                 δ^ALU(A', (M, C, A)) = (M, C, A')


where Memory, CU and ALU are algebraic models of the memory, control
unit, and arithmetic and logic unit. We can change these components
of State with the operations δ^Memory, δ^CU and δ^ALU. This will allow
us to describe how the machine behaves in response to the execution of
a program.

Programs We now turn our attention to the program component of 
machine states. 

By application of a result of Bergstra and Tucker [1987], we know that 
it is possible to algebraically specify the programs of any programming 
language which has a computable syntax. In practice, we can achieve 
this effect by first specifying an appropriate context-free superset of the 
language that we require using a technique developed in Rus [1971] and 
independently in Goguen et al. [1977]. This method describes how we can 
generate a closed term algebra $T(\Sigma^G)$ from the context-free grammar $G$,
such that $T(\Sigma^G) = L(G)$. We can then filter this superset (van Deursen et
al. [1996], Rees et al. [1998]) to produce the (non-context-free) language
$L \subseteq L(G)$ that we require.

Example In Section 4, we shall model the semantics of an instruction set
for an abstract version of the Java Virtual Machine, which includes:

    <instrn> ::= ... | Return_void | Goto <instrn_index> | ...

We model these instructions algebraically with the constant

    Return_void : → instrn

and the function

    Goto : instrn_index → instrn.

In addition though, we have to specify the non-context-free constraints
imposed on JVM programs, such as requiring that Goto instructions refer
to some instruction of the program.
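The two-stage construction (a context-free superset, then a filter) can be illustrated with a small sketch. The class names mirror the two instructions of the example; the predicate `well_formed` and the choice of 0-based instruction indices are assumptions made for this illustration and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass(frozen=True)
class ReturnVoid:
    """The constant instruction Return_void."""

@dataclass(frozen=True)
class Goto:
    """Goto applied to an instruction index."""
    target: int

Instruction = Union[ReturnVoid, Goto]

def well_formed(program: List[Instruction]) -> bool:
    """Filter: every Goto must refer to some instruction of the program.

    ASSUMPTION: instruction indices are 0-based positions in the list.
    """
    return all(
        0 <= instr.target < len(program)
        for instr in program
        if isinstance(instr, Goto)
    )
```

Any list of instructions is a term of the context-free superset; only those accepted by `well_formed` belong to the filtered language. The constraint depends on the length of the whole program, which is why it cannot be expressed in the grammar itself.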

2.3 Specification Framework 

In order to manage the complexity of our models we need to be able 
to describe the architecture of machines at different levels of abstraction. 
Furthermore, to manage the scale of the models produced, we shall impose 
a modular structure on the design; we model the interconnections between 
components by using simple parameterising mechanisms (flattening) that 
allow specifications to be instantiated with other specifications. 

In this section we specify the structures that we shall find useful for 
describing the architecture of machines at a high level of abstraction. 

Cartesian Products Typically, we can model machine states as certain
Cartesian products $CP(A_1, \ldots, A_n)$ of sub-components
$A_1, \ldots, A_n$. For example, we can specify the general von Neumann
architecture of Section 2.2 by

$$STATE = CP(MEMORY, CU, ALU)$$

where $MEMORY$, $CU$ and $ALU$ are specifications of the memory, control
unit and ALU components.

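We specify the general construction by:

specification  $CP(A_1, \ldots, A_n)$
import         $A_1, \ldots, A_n$, $INDEX_n$
sorts          $A$
constants
operations     $\nu : A_1 \times \cdots \times A_n \to A$
               $\ldots, \pi_i : A \to A_i, \ldots$
               $\ldots, \delta_i : A_i \times A \to A, \ldots$
equations      $\ldots, \pi_i(\nu(a_1, \ldots, a_n)) = a_i, \ldots$
               $\ldots, Equals(i, j) = True \Rightarrow \pi_i(\delta_j(a, \nu(a_1, \ldots, a_n))) = a, \ldots$
               $\ldots, Equals(i, j) = False \Rightarrow \pi_i(\delta_j(a, \nu(a_1, \ldots, a_n))) = a_i, \ldots$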

(To make the presentation more concise, we have used a specification
$INDEX_n$ of the indexing set $\{1, \ldots, n\}$, which contains a test
for equality.) Thus, given specifications $A_1, \ldots, A_n$, we can

(i) form tuples of elements;

(ii) project out the component elements; and

(iii) change the values of any tuple component.
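The three capabilities just listed can be sketched in Python using
immutable tuples. The function names `tuple_of`, `project` and `alter`
are our own illustrative choices, and the component values are
placeholder strings rather than real specifications.

```python
from typing import Any, Tuple


def tuple_of(*components: Any) -> Tuple[Any, ...]:
    """Tupling: form a tuple from one element of each component."""
    return tuple(components)


def project(i: int, t: Tuple[Any, ...]) -> Any:
    """Projection: return the i-th component (indices run from 1 to n)."""
    return t[i - 1]


def alter(i: int, a: Any, t: Tuple[Any, ...]) -> Tuple[Any, ...]:
    """Alteration: replace the i-th component by a, keeping the others."""
    return t[: i - 1] + (a,) + t[i:]


# A placeholder state with memory, control unit and ALU components.
state = tuple_of("memory", "cu", "alu")
```

Projecting at the altered index returns the new value, and projecting
at any other index returns the old component, which is the behaviour
that the index equality test distinguishes.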

Lists One of the simplest structuring mechanisms is that of forming a
list of elements from a given data type; the only complication that
arises is that we want to be able to have lists that are composed of
elements that are of different sorts from an arbitrary specification $A$
with sort set $S$ (which for ease of notation, we denote by
$S = \{\ldots, s, \ldots\}$):

specification  $LIST(A)$
import         $A$
sorts          $list(S)$
constants      $Empty : \to list(S)$
operations     $\ldots, Lst_s : s \times list(S) \to list(S), \ldots$
equations



Stacks A central structure of the JVM is the stack; we specify the
generic stack structure by:

specification  $STACK(A)$
import         $A$
sorts          $\ldots, stack_s, \ldots$
               $\ldots, s_{stack\_underflow}, \ldots$
constants      $\ldots, EmptyStack_s : \to stack_s, \ldots$
               $\ldots, Underflow_s : \to s_{stack\_underflow}, \ldots$
operations     $\ldots, Push_s : s \times stack_s \to stack_s, \ldots$
               $\ldots, Pop_s : stack_s \to stack_s, \ldots$
               $\ldots, Top_s : stack_s \to s_{stack\_underflow}, \ldots$
               $\ldots, \iota_s : s \to s_{stack\_underflow}, \ldots$
equations      $\ldots, Top_s(EmptyStack_s) = Underflow_s, \ldots$
               $\ldots, Top_s(Push_s(d, S)) = \iota_s(d), \ldots$
               $\ldots, Pop_s(EmptyStack_s) = EmptyStack_s, \ldots$
               $\ldots, Pop_s(Push_s(d, S)) = S, \ldots$



Note that later we shall consider how we can concretise this structure
to provide a model of the concrete JVM. In particular, at some point
in this process we shall merge the family of $S$-sorted stacks into a
single stack. As a consequence, we shall need to know the order in which
we take elements of different sorts from the stack. Thus, in our later
descriptions of the semantics of the abstract model of the JVM, where
such considerations will affect the concrete model, we indicate the
relative order of the elements on the stack.
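A minimal Python sketch of a stack with an explicit underflow element
may help to fix ideas. It models a single sort only, represents stacks
as immutable tuples with the top element first, and uses a sentinel
object named `UNDERFLOW`; all of these are our own assumptions made for
illustration.

```python
from typing import Any, Tuple

UNDERFLOW = object()               # stands in for the underflow error element
EMPTY_STACK: Tuple[Any, ...] = ()  # the empty stack


def push(d: Any, stack: Tuple[Any, ...]) -> Tuple[Any, ...]:
    """Place d on top of the stack."""
    return (d,) + stack


def pop(stack: Tuple[Any, ...]) -> Tuple[Any, ...]:
    """Remove the top element; popping the empty stack leaves it empty."""
    return stack[1:]


def top(stack: Tuple[Any, ...]) -> Any:
    """Return the top element, or UNDERFLOW if the stack is empty."""
    return stack[0] if stack else UNDERFLOW
```

The four assertions one would naturally check against this sketch are
the four stack equations: taking the top of, or popping, an empty stack
and a stack built with a push.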

Tables We shall use tables to store arbitrary data types which allow
direct access to the data through an indexing mechanism. We impose the
restriction that the specification $INDEX$ of the indexing scheme comes
complete with a specification of the Booleans, and an equality function

$$Equals_{index} : index \times index \to bool$$

on indices.

specification  $TABLE(A, INDEX)$
import         $A$, $INDEX$
sorts          $\ldots, s_{uninitialised}, \ldots$
               $\ldots, table_s, \ldots$
constants      $\ldots, Empty_s : \to table_s, \ldots$
               $\ldots, Uninitialised_s : \to s_{uninitialised}, \ldots$
operations     $\ldots, Read_s : index \times table_s \to s_{uninitialised}, \ldots$
               $\ldots, Store_s : s \times index \times table_s \to table_s, \ldots$
               $\ldots, \iota_s : s \to s_{uninitialised}, \ldots$
equations      $\ldots, Read_s(i, Empty_s) = Uninitialised_s, \ldots$
               $\ldots, Equals_{index}(i, j) = true \Rightarrow Read_s(i, Store_s(d, j, T)) = \iota_s(d), \ldots$
               $\ldots, Equals_{index}(i, j) = false \Rightarrow Read_s(i, Store_s(d, j, T)) = Read_s(i, T), \ldots$
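The table equations can be read directly as a recursive lookup. The
Python sketch below is an illustration under our own assumptions: a
table is a tuple of (index, value) pairs with the most recent store
first, a single sort is modelled, and the sentinel `UNINITIALISED`
stands in for the error element.

```python
from typing import Any, Tuple

UNINITIALISED = object()  # stands in for the uninitialised error element
EMPTY_TABLE: Tuple[Tuple[Any, Any], ...] = ()


def store(d: Any, j: Any, table: tuple) -> tuple:
    """Record that index j now holds the value d."""
    return ((j, d),) + table


def read(i: Any, table: tuple) -> Any:
    """Look up index i, following the three table equations in turn."""
    if not table:                  # reading from the empty table
        return UNINITIALISED
    (j, d), rest = table[0], table[1:]
    if i == j:                     # the indices are equal
        return d
    return read(i, rest)           # the indices differ: look further back
```

A later store to the same index shadows an earlier one, because the
lookup stops at the first matching pair.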



Error-Handling The generic specifications of stacks and tables given
above can both return error elements ($Underflow_s$ and
$Uninitialised_s$, respectively) in certain circumstances. As these
specifications will form the basis of the architectural components of
the JVM, these error elements will percolate through to most aspects of
the model of the JVM. (For example, the JVM is a stack-based machine, so
the execution of the majority of the instructions can potentially create
errors.)

We would like our specifications to be strict in their error-handling,
but we would like to deal with the propagation of these errors in a
manageable fashion.

We could introduce equations to specifically propagate the errors
through the specifications, or alternatively, we could introduce some
algebraic machinery (for example, Goguen and Meseguer [1992] or
Haveraaen and Wagner [1995]) that would automatically deal with the
errors. In order to avoid the explosion of equations that results from
the first route, and the additional algebraic overhead of the second, we
shall adopt a pragmatic view; henceforth, we deliberately omit the
additional equations that would be required to give a complete
specification.

To illustrate the conventions that we adopt, consider the definition of
the instruction

$$\mathsf{Dup}_s : \to instrn$$

from Section 4.1, that duplicates the top value of the operand stack by:

$$[\![\mathsf{Dup}_s]\!](\sigma) = \Delta^{PC}_{+1}(Load_s(Fetch_s(\sigma), \sigma))$$

The function

$$Fetch_s : jvm\_state \to s_{stack\_underflow}$$

takes the top element of the operand stack, and so can produce an error.
The functions $Load_s$ (that places an element onto the top of the
operand stack) and $\Delta^{PC}_{+1}$ (that increments the program
counter), however, do not produce errors when they are applied, except
when they simply propagate errors. Thus, we give the type of these
functions as

$$Load_s : s \times jvm\_state \to jvm\_state$$

and

$$\Delta^{PC}_{+1} : jvm\_state \to jvm\_state$$

so withholding the error propagation typing information:

$$Load_s : s_{stack\_underflow} \times jvm\_state \to jvm\_state_{stack\_underflow}$$

$$\Delta^{PC}_{+1} : jvm\_state_{stack\_underflow} \to jvm\_state_{stack\_underflow}$$

In addition, we suppress the error-propagating conditional equations:

$$Equals_{s_{stack\_underflow}}(Fetch_s(\sigma), \iota_s(Underflow)) = True \Rightarrow [\![\mathsf{Dup}_s]\!](\sigma) = \iota_{jvm\_state}(Underflow)$$

$$Equals_{s_{stack\_underflow}}(Fetch_s(\sigma), \iota_s(Underflow)) = False \Rightarrow [\![\mathsf{Dup}_s]\!](\sigma) = \Delta^{PC}_{+1}(Load_s(Fetch_s(\sigma), \sigma))$$

It should be emphasised that we do not wish to trivialise the error
cases, which are important and informative. On restoring the missing
error-propagating information to our specifications, we would find that
errors would not arise and propagate had we subjected the instructions
to some suitable prior analysis (such as that provided in practice by
the JVM bytecode verifier). The convention we adopt allows us to treat
the dynamic semantics of the JVM with greater clarity.
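To make the suppressed propagation concrete, the following Python sketch
lifts ordinary functions so that any error argument is passed through
unchanged. It is one possible reading of strict error-handling, not the
algebraic machinery cited above; the decorator `strict`, the class
`Error`, and the two-component state (program counter, operand stack)
are hypothetical simplifications.

```python
import functools


class Error:
    """An error element such as a stack underflow."""

    def __init__(self, name: str) -> None:
        self.name = name


UNDERFLOW = Error("Underflow")


def strict(f):
    """Lift f so that an Error among its arguments is returned unchanged."""
    @functools.wraps(f)
    def lifted(*args):
        for arg in args:
            if isinstance(arg, Error):
                return arg
        return f(*args)
    return lifted


def fetch(state):
    """Top of the operand stack; this is where an error can originate."""
    _pc, stack = state
    return stack[0] if stack else UNDERFLOW


@strict
def load(value, state):
    """Push value onto the operand stack."""
    pc, stack = state
    return (pc, (value,) + stack)


@strict
def inc_pc(state):
    """Increment the program counter."""
    pc, stack = state
    return (pc + 1, stack)


def dup(state):
    """Duplicate the top of the operand stack, then advance the counter."""
    return inc_pc(load(fetch(state), state))
```

With this lifting, `dup` is written exactly as in the error-free
reading, yet applying it to a state with an empty operand stack yields
the underflow element rather than a state.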

Filtering As we indicated in Section 2.2, we typically need to impose 
additional constraints on a specification to define the set of well-formed 
programs of a language. We shall perform this task by adding a filtering 
step to the specifications we have introduced. Such filters allow us to 
define the typically occurring constraints of 

distinctness: that elements of a list be distinct from each other; 
disjointness: that two lists are disjoint from each other; 
completeness: that every element of one list is present in another. 

For an axiomatisation of these filters, see Stephenson [1996], Rees et al.
[1998]. In this paper, we shall simply indicate that we need to apply a
filter by prefixing a specification name with “F”, as in $FCP$,
$FTABLE$, etc., and listing which of the properties given above the
filter captures.
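The three filter properties are simple to state as predicates on lists.
The Python sketch below is an informal illustration only (it is not the
axiomatisation referred to above); the function names are ours, and we
assume that list elements are hashable.

```python
from typing import Hashable, Sequence


def distinct(xs: Sequence[Hashable]) -> bool:
    """Distinctness: the elements of xs are distinct from each other."""
    return len(set(xs)) == len(xs)


def disjoint(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> bool:
    """Disjointness: xs and ys have no element in common."""
    return set(xs).isdisjoint(ys)


def complete(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> bool:
    """Completeness: every element of xs is present in ys."""
    return set(xs) <= set(ys)
```

For instance, a filtered table of variables might require that its
names be distinct and that every name used by an instruction be complete
with respect to the declared names.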

3 An Abstract Model of the JVM 

To axiomatise the semantics of JVM programs (Java bytecodes), we first 
build a high-level specification of the JVM in this section which abstracts 
away any features that could be classed as implementation-dependent. On 
this architecture that we produce, we describe the semantics (in Section 4) 
of a set of abstract instructions. 

We start in Section 3.1 by presenting an overview of the JVM, concen- 
trating in particular on its architecture. Then in Section 3.2, we describe 
the abstractions that we have employed in this model, and in Section 3.3, 
specify the architecture of the abstract machine. Finally, we specify its in- 
struction set in Section 3.4; we devote the whole of Section 4 to specifying 
the semantics of these instructions. 

3.1 Overview of the JVM 

The JVM displays characteristics typical of both high- and low-level
virtual machines. This hybridisation is intentional, in that the JVM has
been designed to be at a sufficiently low level of abstraction so that
it can be efficiently executed (by interpretation, compilation or direct
execution), whilst being at a sufficiently high level to enable it to be
architecturally neutral.

These considerations are reflected in many aspects of the JVM’s de- 
sign. For instance, the JVM, like many abstract machines for high-level 
languages, but unlike many abstract machines for low-level languages 
(and many traditional hardware implementations), is a stack-based ma- 
chine. 

Our concern in this paper is that of the essential structure of the 
JVM: to perform this analysis, we model an abstracted version of the 
JVM which pays no attention to any implementation-dependent features. 
We can then reinstate these features to produce a model of the concrete 
JVM. 

Architecture We illustrate the architecture of the abstract JVM in 
Figure 2 (see Section 3.2 for a description of how this model abstracts 
from the actual JVM). 

First though, we outline the principal components of the JVM and
describe their purpose. The JVM has all the characteristics of the
classical von Neumann architecture discussed in Section 2.2:

Programs The programs are stored in the Method Area. A program for the 
JVM is low-level in the sense that programs can only be constructed by 
the sequential composition of instructions. However, the code is designed 
to support object-oriented structuring: the code is split into classes, each 
of which defines a set of methods that can be applied to instantiations of 
a class. 

Memory The JVM’s memory is termed the Heap. Here the class instan- 
tiations are stored. Note that the values of the variables that constitute 
an instantiation can be shared either amongst all the instantiations of a 
class (static fields) or are tied to particular instances of a class (non-static 
fields). 

ALU The Operand Stack and the Local Variables of the JVM together 
constitute its ALU. The operand stack is used as a temporary storage 
area to calculate the value (if any) that a method computes. The local 
variables store the parameters that are passed to methods, together with 
variables that are local to a method.

Control Unit The control unit of the JVM is split between two areas:

(i) the Registers, which in our abstract version of the JVM simply consist
of the Program Counter that determines the execution order of the
instructions of the current method; and
(ii) the Execution Environment, which determines the method that is
currently executing, together with the point at which this method will
return to within the calling method when it completes.

Structure Structurally, the JVM stores the ALU components (the Oper- 
and Stack and Local Variables) and the Execution Environment of the 
control unit together as a Frame. Thus, a frame stores all the transitory 
information needed during the execution of a single method. 

To enable methods to call each other arbitrarily, frames are stacked 
together within the Frames area of the JVM; the frame which is at the 
top of this stack is the one which is currently executing, and the frame of 
the method which initiated it is stored directly underneath, and so forth. 

We record the current method’s point of execution within the Reg- 
isters area. (Note that in the concrete JVM, there may also be registers 
to control aspects of the Frames; see Section 3.2.) Together, the Frames 
and Registers constitute a Thread of the JVM. We maintain this thread 
structure, even though we adopt a simplified single-threaded model, so as 
to provide a firm foundation for extending this work to consider multiple 
threads. 

The components of Thread, Heap and Method Area together consti- 
tute the State of the JVM. 

3.2 Abstraction 

Our high-level view of the JVM exhibits three types of abstraction: that 
of data, structure and efficiency. 

Data Abstraction We shall produce a model of the JVM which operates 
over some many-sorted V-abstract data type A. Note that in order to 
model the flow-of-control instructions that the JVM has, we shall presume 
that A specifies a Boolean- and Naturals-standard algebra. 

We shall assume that we have a specification for the abstract data 
type A that 

(i) imports specifications $INSTRN\_INDEX$ and $OBJECT\_INDEX$ for
indexing sets which have sort sets $instrn\_index$ and $object\_index$,
respectively;

(ii) for each sort $s \in S$, determines the equality function

$$Equals_s : s \times s \to bool$$

on elements;

(iii) for each sort $s \in S$ and word $w \in S^*$ with $|w| \neq 0$,
defines the operations

$$Apply_{w,s} : fun_{w,s} \times list(S) \to s_{error}$$

on the set of functions of the data type $A$, so that $Apply_{w,s}(f, L)$
returns the result of applying the function $f$ to the arguments given
in the list $L$ if this is all well-typed, and otherwise returns an
error; and

(iv) for each sort $s \in S$, defines a default value

$$DefaultValue_s : \to s$$

associated with the sort $s$.

In order to incorporate the syntax of $A$ into the instructions, we shall
assume that we have a specification

$$SIGNATURE(\Sigma)$$

of the signature $\Sigma$ of $A$ (for example, Rees et al. [1998]).
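The type-checked application required of the data type can be sketched
as follows. This is an illustration under our own assumptions: Python
types stand in for sorts, the record `Fun` pairs a function with its
argument word and result sort, and the sentinel `ERROR` stands in for
the error element; none of these names comes from the specification.

```python
from typing import Any, Callable, NamedTuple, Sequence, Tuple

ERROR = object()  # stands in for the error element


class Fun(NamedTuple):
    arg_sorts: Tuple[type, ...]  # the word w of argument sorts
    result_sort: type            # the result sort s
    body: Callable[..., Any]     # the function itself


def apply(f: Fun, args: Sequence[Any]) -> Any:
    """Apply f to the list of arguments if well-typed, else return ERROR."""
    if len(args) != len(f.arg_sorts):
        return ERROR
    if not all(type(a) is sort for a, sort in zip(args, f.arg_sorts)):
        return ERROR
    return f.body(*args)


# A hypothetical binary function on integers.
plus = Fun((int, int), int, lambda x, y: x + y)
```

The check compares exact types, so a Boolean argument is rejected where
an integer is expected, mirroring the separation of sorts.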

In addition, we shall also abstract away completely from how this data 
is represented. We shall simply consider that the abstract JVM is able to 
store, transfer and manipulate values of the data type A. 

Structure Abstraction We shall abstract away from how we implement 
the different data structures that we use to store the components of the 
JVM. We can split this idea into three different applications. 

Firstly, the internal stack of the concrete JVM is commonly imple- 
mented as an array of values, with a pointer (stored in a register) to 
the value which is the current top of the stack. We shall remove this 
implementation-dependent feature, and simply specify stacks using the 
STACK data type of Section 2.3. 

Secondly, the internal stack of the concrete JVM is used to store 
three different types of information (the operand values, the execution 
environment and the local variables). In order to differentiate between 
these elements, the concrete JVM is commonly implemented using two 
registers. In our abstract model though, we shall simply consider that 
these are different structures which we can project out of the state.
Thus, our abstract JVM has just one register (PC) which controls
the execution order of the instructions, and the execution environment
does not have to perform the role of maintaining the internal stack that
it typically does in implementations of the JVM.

Thirdly, we shall employ the TABLE data structure of Section 2.3 
to allow access to values through some location mechanism. We shall 
consider that we have such indices (for example, the names of variables 
or constants) by which we can access the locations in which we store these 
values. 

Efficiency Abstraction As the JVM is a practical model of computation,
it has efficient versions of the most commonly deployed instructions
(in addition to the _quick variants, which we do not consider in this
paper). For example, the most basic type of load instruction requires
the location of the local variable to be specified; the more efficient
versions of this instruction are specific to individual locations, so
eliminating the need to store and retrieve this information.

As we are interested in modelling functionality rather than efficiency,
these efficient versions of instructions have no part to play in our
abstract model (and indeed, our architectural abstractions prevent us
from being able to consider such instructions). Later though, in
Section 5, we shall describe how we can model the effect of these
efficient versions on the JVM when we have a more concrete model of the
JVM (which uses positions rather than names to locate values).

3.3 Architectural Specification 

We need to construct a specification for the architecture of our abstract 
JVM illustrated in Figure 2. In fact, this is now a straightforward task 
given our generic specification structures of Section 2 and the abstractions 
listed in Section 3.2. We illustrate the architecture of our specification for 
the structure of the abstract JVM in Figure 3. 

Fig. 3. The specification architecture of our abstract JVM. (For
typographical reasons, we have omitted to indicate the reliance of a
specification on the underlying abstract data type $A$ or its signature
$\Sigma$.)

Notation For convenience, we introduce projection and alteration
functions that operate directly on each aspect of the JVM. For example,
we define the operations

$$\Pi^{PC} : jvm\_state \to instrn\_index$$

$$\Delta^{PC} : instrn\_index \times jvm\_state \to jvm\_state$$

to allow us to access and change respectively, the value of the program
counter by:

$$\Pi^{PC}(\sigma) = \pi^{PC}(\pi^{Registers}(\pi^{Thread}(\sigma)))$$

$$\Delta^{PC}(i, \sigma) = \delta^{Thread}(\delta^{Registers}(\delta^{PC}(i, \Pi^{Registers}(\sigma)), \Pi^{Thread}(\sigma)), \sigma)$$

Thus, the function $\Pi^{PC}$ works by repeatedly applying projection
functions from the Cartesian product specification of Section 2.3, as
each of the elements of State, Thread and Registers are Cartesian
products (see Figure 3). The function $\Delta^{PC}$ is similarly defined
(although slightly more involved), using only Cartesian product
operations.

For the program counter register, we also introduce a further function

$$\Delta^{PC}_{+1} : jvm\_state \to jvm\_state$$

for conciseness which increments the value of the program counter by one:

$$\Delta^{PC}_{+1}(\sigma) = \Delta^{PC}(Plus(\Pi^{PC}(\sigma), Succ(Zero)), \sigma)$$
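The nesting of projections and alterations can be sketched with
immutable Python dataclasses. The field names, and the use of empty
tuples as placeholders for the frames, heap and method area, are our own
assumptions; only the State, Thread and Registers nesting is taken from
the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Registers:
    pc: int


@dataclass(frozen=True)
class Thread:
    frames: tuple
    registers: Registers


@dataclass(frozen=True)
class State:
    thread: Thread
    heap: tuple
    method_area: tuple


def get_pc(state: State) -> int:
    """Project the program counter out of State, Thread and Registers."""
    return state.thread.registers.pc


def set_pc(i: int, state: State) -> State:
    """Alter the program counter by rebuilding each enclosing product."""
    registers = replace(state.thread.registers, pc=i)
    thread = replace(state.thread, registers=registers)
    return replace(state, thread=thread)


def inc_pc(state: State) -> State:
    """Increment the program counter by one."""
    return set_pc(get_pc(state) + 1, state)
```

Altering is more involved than projecting because each level of the
nesting must be rebuilt around the changed component, while every other
component is carried over unchanged.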

3.4 Abstract Instructions 

We can break down the instructions of our abstract JVM into six cate- 
gories of related instructions. Each instruction (or rather, each family of 
instructions in most cases) follows the abstraction principles laid out in 
Section 3.2: 

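(i) operations on the underlying data type ($\mathsf{Eval}_{w,s}$);

(ii) manipulating values on the operand stack ($\mathsf{Dup}_s$,
$\mathsf{Pop}_s$ and $\mathsf{Swap}_{s,s'}$);

(iii) transferring values between the local variables and the operand
stack ($\mathsf{Load}_s$ and $\mathsf{Store}_s$);

(iv) loading constants from the constant pool ($\mathsf{ConPlLoad}_s$);

(v) operations on objects ($\mathsf{New}$, $\mathsf{GetField}_s$,
$\mathsf{GetStatic}_s$, $\mathsf{PutField}_s$ and
$\mathsf{PutStatic}_s$); and

(vi) flow-of-control ($\mathsf{Goto}$, $\mathsf{Jsr}$, $\mathsf{Nop}$,
$\mathsf{Case}_s$, $\mathsf{Cond}_w$, $\mathsf{Invoke}$,
$\mathsf{Return}_s$ and $\mathsf{Return}_{void}$).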



The Syntax of the Instruction Set Using the method outlined in
Section 2.2, we produce a specification for the instruction set:

specification  $INSTRN(S)$
import         $SIGNATURE(\Sigma)$, $CONST\_NAME$, $CLASS\_NAME$,
               $FIELD\_NAME$, $METHOD\_NAME$, $VAR\_NAME$,
               $INSTRN\_INDEX$
sorts          $instrn$
constants      $\mathsf{Nop} : \to instrn$
               $\mathsf{New} : \to instrn$
               $\mathsf{Return}_{void} : \to instrn$
               $\ldots, \mathsf{Return}_s : \to instrn, \ldots$
               $\ldots, \mathsf{Dup}_s : \to instrn, \ldots$
               $\ldots, \mathsf{Pop}_s : \to instrn, \ldots$
               $\ldots, \mathsf{Swap}_{s,s'} : \to instrn, \ldots$
operations     $\mathsf{Goto} : instrn\_index \to instrn$
               $\mathsf{Jsr} : instrn\_index \to instrn$
               $\mathsf{Invoke} : class\_name \times method\_name \times list(var\_name) \to instrn$
               $\ldots, \mathsf{Case}_s : s \times list(s \times instrn\_index) \times instrn\_index \to instrn, \ldots$
               $\ldots, \mathsf{Cond}_w : fun_{w,bool} \times instrn\_index \to instrn, \ldots$
               $\ldots, \mathsf{Eval}_{w,s} : fun_{w,s} \to instrn, \ldots$
               $\ldots, \mathsf{Load}_s : var\_name_s \to instrn, \ldots$
               $\ldots, \mathsf{Store}_s : var\_name_s \to instrn, \ldots$
               $\ldots, \mathsf{ConPlLoad}_s : const\_name_s \to instrn, \ldots$
               $\ldots, \mathsf{GetField}_s : field\_name_s \to instrn, \ldots$
               $\ldots, \mathsf{GetStatic}_s : class\_name \times field\_name_s \to instrn, \ldots$
               $\ldots, \mathsf{PutField}_s : field\_name_s \to instrn, \ldots$
               $\ldots, \mathsf{PutStatic}_s : class\_name \times field\_name_s \to instrn, \ldots$
equations







The Syntax of Programs As we shall explain in Section 4.7, we create
programs from the instruction set by forming tables of individual
instructions, labelled by an appropriate indexing scheme (which we
specify by $INSTRN\_INDEX$). Onto this context-free superset of the
syntax, we have to impose some additional constraints on the language to
ensure that it satisfies some non-context-free properties; for example,
that any branch instruction refers to an instruction within the same
method.

We now define the semantics of these instructions. 

4 Semantics of the Abstract JVM 

Using the operations provided by the architecture specification frame- 
work illustrated in Figure 3, we define the semantics of the individual 
instructions of the abstract JVM given in Section 3.4. 

This will allow us to define the semantics of abstract JVM programs;
recall from Section 2 that to perform this task, we just need to define a
next-state function. For this machine, we define our next-state function

$$Next : jvm\_state \to jvm\_state$$

by

$$Next(\sigma) = [\![Fetch^{Instrn}(\sigma)]\!](\sigma)$$

so that $Next(\sigma)$ locates and executes the current instruction on
the state $\sigma$. (We define the function $Fetch^{Instrn}$ that
performs this location service in Section 4.7.)
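The shape of the next-state function can be sketched in Python by
treating the meaning of each instruction as a function from states to
states. This is a deliberately small illustration: the state record,
its accumulator field `acc`, and the instruction `incr` are hypothetical
and are not part of the JVM model; the program is held in the state
purely for convenience.

```python
from typing import Callable, NamedTuple, Tuple


class State(NamedTuple):
    pc: int                       # program counter
    program: Tuple[Callable, ...]  # meanings of the program's instructions
    acc: int                      # hypothetical data component


def fetch_instrn(state: State) -> Callable[[State], State]:
    """Locate the meaning of the current instruction."""
    return state.program[state.pc]


def next_state(state: State) -> State:
    """Locate the current instruction and execute it on the state."""
    return fetch_instrn(state)(state)


def nop(state: State) -> State:
    """Do nothing except advance the program counter."""
    return state._replace(pc=state.pc + 1)


def incr(state: State) -> State:
    """Hypothetical instruction: add one to acc and advance the counter."""
    return state._replace(pc=state.pc + 1, acc=state.acc + 1)
```

Iterating `next_state` then traces out the execution of a program, one
instruction per step.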



4.1 The Operand Stack 

As the JVM is a stack-based architecture, rather than being register- 
based, the operand stack is the hub of all activities concerning the ap- 
plication of functions to data. The data to be manipulated comes from, 
and is distributed to, the various repositories within the JVM: load in- 
structions deposit values onto the operand stack, and store instructions 
retrieve values. 



Structure At our high level of abstraction, we model the operand stack 
as a stack of values from the abstract data type A: 

$$OPERAND\_STACK(A) = STACK(A)$$



Transferring Values We can either load a value onto the operand stack,
or we can fetch a value from the operand stack and store it elsewhere.

Loading Values onto the Operand Stack A load instruction places a copy
of a value stored in a location of the JVM on the top of the operand stack.
In particular, there are instructions to load values from:

— the local variables (Section 4.2); 

— dynamically created objects that are stored in the heap (Section 4.8); 
and 

— the constant pool (Section 4.7). 

We shall split the action of these instructions into two principal
components:

(i) first, we fetch the value from the specified location, and then
(ii) we place this value onto the operand stack.

We exploit the independence of action (ii) to produce more modular
specifications before dealing with the different fetching operations of
action (i). To load a value onto the operand stack, we define a function

$$Load_s : s \times jvm\_state \to jvm\_state$$

so that $Load_s(v, \sigma)$ pushes the value $v$ onto the top of the
operand stack of the state $\sigma$:

$$Load_s(v, \sigma) = \Delta^{OpStack}(Push_s(v, \Pi^{OpStack}(\sigma)), \sigma)$$

Fetching Values from the Operand Stack The store instructions work in a
similar fashion, but this time we transfer a value from the operand stack
to elsewhere. Again, we split the action of the store instructions into
that of fetching and storing (but note that we have no instructions to
store values in the constant pool, as by their nature these elements are
static).

In order to fetch an item from the operand stack, we need to be able
to determine the top element of a stack and to remove it. Thus, we need
to use the functions $Top_s$ and $Pop_s$ of the generic stack
specification; for ease of notation we introduce functions

$$Fetch_s : jvm\_state \to s_{stack\_underflow}$$

$$Remove_s : jvm\_state \to jvm\_state$$

that work directly on states:

$$Fetch_s(\sigma) = Top_s(\Pi^{OpStack}(\sigma))$$

$$Remove_s(\sigma) = \Delta^{OpStack}(Pop_s(\Pi^{OpStack}(\sigma)), \sigma)$$

Notation To aid the clarity of later definitions, we extend each of our
operations $Load_s$, $Fetch_s$ and $Remove_s$ to operate on lists with
the functions

$$Load^w : list(S) \times jvm\_state \to jvm\_state_{error}$$

$$Fetch^w : jvm\_state \to list(S)_{stack\_underflow}$$

$$Remove^w : jvm\_state \to jvm\_state$$

which are all simply defined by recursion over lists, except that in
addition, the function $Load^w$ checks the types of the list of
arguments that it is presented with.

Manipulating the Operand Stack We have three types of instructions
in the JVM to manipulate the operand stack; we can pop values from the
stack, and because the JVM is stack-based we have instructions to swap
and duplicate values on the stack.

We pop elements from the operand stack with the instruction

$$\mathsf{Pop}_s : \to instrn$$

by:

$$[\![\mathsf{Pop}_s]\!](\sigma) = \Delta^{PC}_{+1}(Remove_s(\sigma))$$

We can swap the top two values (of sorts $s$ and $s'$ respectively) around
on the operand stack with the instruction

$$\mathsf{Swap}_{s,s'} : \to instrn$$

by removing them both from the stack, and then pushing them back on
in the reverse order:

$$[\![\mathsf{Swap}_{s,s'}]\!](\sigma) = \Delta^{PC}_{+1}(Load^{s' s}(Lst_{s'}(Fetch_{s'}(Remove_s(\sigma)), Lst_s(Fetch_s(\sigma), EmptyList)), Remove^{s s'}(\sigma)))$$

We can duplicate the top value on the operand stack with the
instruction

$$\mathsf{Dup}_s : \to instrn$$

by:

$$[\![\mathsf{Dup}_s]\!](\sigma) = \Delta^{PC}_{+1}(Load_s(Fetch_s(\sigma), \sigma))$$

Performing Calculations The JVM operates by applying the functions
of the underlying data type to arguments which are stored on the operand
stack.

We define the instruction

$$\mathsf{Eval}_{w,s} : fun_{w,s} \to instrn$$

so that $[\![\mathsf{Eval}_{w,s}(f)]\!](\sigma)$ applies the function
$f : w \to s$ to the arguments stored on the top of the operand stack,
and replaces these values with that of the result:

$$[\![\mathsf{Eval}_{w,s}(f)]\!](\sigma) = \Delta^{PC}_{+1}(Load_s(Apply_{w,s}(f, Fetch^w(\sigma)), Remove^w(\sigma)))$$
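The four stack-manipulating instructions can be sketched in Python on a
simplified state. The sketch follows the convention of omitting error
cases, so it assumes that the operand stack holds enough values; the
pair (program counter, operand stack), the function names, and the
choice to pass the topmost value as the first argument of the evaluated
function are all our own assumptions.

```python
def pop_i(state):
    """Pop: discard the top value and advance the program counter."""
    pc, stack = state
    return (pc + 1, stack[1:])


def swap_i(state):
    """Swap: exchange the top two values and advance the program counter."""
    pc, (first, second, *rest) = state
    return (pc + 1, (second, first, *rest))


def dup_i(state):
    """Dup: push a copy of the top value and advance the program counter."""
    pc, stack = state
    return (pc + 1, (stack[0],) + stack)


def eval_i(f, arity):
    """Eval: replace the top `arity` values by the result of applying f."""
    def run(state):
        pc, stack = state
        args, rest = stack[:arity], stack[arity:]
        return (pc + 1, (f(*args),) + rest)
    return run
```

Each function returns a new state, so an instruction sequence is simply
a composition of such functions.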

4.2 The Local Variables 

We use the local variables area of the JVM to store the values of param- 
eters that are passed to methods, together with the values of variables 
that are local to a method. For this reason and because the JVM is a 
stack-based architecture, the only instructions acting on local variables 
are to transfer values between the local variables and the operand stack. 

Structure We model the local variables as a table indexed by the names
of the variables (be they parameters or local variables), and storing their
values:

$$VARS(A) = TABLE(VAR\_NAME(S), A)$$

where $VAR\_NAME(S)$ specifies a set of variable names, which are typed
with the sort set $S$ of the underlying data type $A$.

Retrieving Local Variables To retrieve values from local variables, we
define a function

$$Fetch^{Vars}_s : var\_name_s \times jvm\_state \to s_{uninitialised}$$

so that $Fetch^{Vars}_s(x, \sigma)$ fetches the value that is stored in
the local variable $x$ of the state $\sigma$:

$$Fetch^{Vars}_s(x, \sigma) = Read_s(x, \Pi^{Vars}(\sigma))$$

Now we can define the instruction

$$\mathsf{Load}_s : var\_name_s \to instrn$$

so that $[\![\mathsf{Load}_s(x)]\!](\sigma)$ places a copy of the value
stored in the local variable $x$ on the top of the operand stack of the
state $\sigma$:

$$[\![\mathsf{Load}_s(x)]\!](\sigma) = \Delta^{PC}_{+1}(Load_s(Fetch^{Vars}_s(x, \sigma), \sigma))$$

Storing Local Variables Similarly, we define a function

$$Store^{Vars}_s : s \times var\_name_s \times jvm\_state \to jvm\_state$$

so that $Store^{Vars}_s(v, x, \sigma)$ stores the value $v$ in the local
variable $x$ of the state $\sigma$:

$$Store^{Vars}_s(v, x, \sigma) = \Delta^{Vars}(Store_s(v, x, \Pi^{Vars}(\sigma)), \sigma)$$

We can now define the instruction

$$\mathsf{Store}_s : var\_name_s \to instrn$$

so that $[\![\mathsf{Store}_s(x)]\!](\sigma)$ transfers the value from the
top of the operand stack to the local variable $x$ of the state $\sigma$:

$$[\![\mathsf{Store}_s(x)]\!](\sigma) = \Delta^{PC}_{+1}(Store^{Vars}_s(Fetch_s(\sigma), x, Remove_s(\sigma)))$$
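The transfer of values between the local variables and the operand stack
can be sketched in Python as follows. As before, error cases are
omitted; the three-component state (program counter, operand stack,
variable table), the use of a dictionary for the table, and the names
`load_i` and `store_i` are our own simplifying assumptions.

```python
def load_i(x):
    """Load: push a copy of the local variable x onto the operand stack."""
    def run(state):
        pc, stack, variables = state
        return (pc + 1, (variables[x],) + stack, variables)
    return run


def store_i(x):
    """Store: move the top of the operand stack into the local variable x."""
    def run(state):
        pc, stack, variables = state
        # Build a new table rather than updating the old one in place.
        return (pc + 1, stack[1:], {**variables, x: stack[0]})
    return run
```

Loading leaves the variable table untouched, whereas storing both
removes the top of the stack and produces an updated table.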

4.3 The Execution Environment 

Within each frame, the execution environment stores information about 
the method currently executing (the method and class), together with 
information (the return instruction address) about the point where the 
previous method was interrupted to initiate the current method. 

Structure We store this information in the execution environment as a
Cartesian product:

$$EXEC\_ENV = CP(CLASS\_NAME, METHOD\_NAME, INSTRN\_INDEX)$$

where $CLASS\_NAME$ specifies a set of class names, $METHOD\_NAME$
a set of method names, and $INSTRN\_INDEX$ the indexing set for the
instructions.

Setting the Execution Environment We operate on the execution
environment with a function

$$SetEnv : class\_name \times method\_name \times jvm\_state \to jvm\_state$$

so that $SetEnv(c, m, \sigma)$ sets the execution environment up with the
class $c$ and method $m$ as the new current values, together with the
instruction address of where to return to (the instruction following the
current one):

$$SetEnv(c, m, \sigma) = \Delta^{ExecEnv}(\nu^{ExecEnv}(c, m, Plus(\Pi^{PC}(\sigma), Succ(Zero))), \sigma)$$

We set the environment upon the invocation of a method, as we shall
now see in the next section.

4.4 Frames

A frame collects together the operand stack (on which we store the val- 
ues of (partial) computations), the execution environment (that we use 
to restore information to another frame when we have completed the ex- 
ecution of the current method), and the set of local variables (in which 
we store the values of local variables and parameters of methods). 

Structure We model an individual frame as a Cartesian product of the
operand stack, the execution environment and the local variables:

$$FRAME(A) = CP(OPERAND\_STACK(A), EXEC\_ENV, VARS(A))$$

We stack individual frames on top of each other to form the frames of the
JVM:

$$FRAMES(A) = STACK(FRAME(A))$$

The current frame is that which is at the top of this stack.

Invoking methods When a method is invoked, we have to push a new
frame onto the stack of frames and we have to store information about
the point of interruption so that we can correctly resume the old method
when the new method completes.

To create a new frame, we need to introduce a function

$$SetVars : class\_name \times method\_name \times list(var\_name) \times jvm\_state \to vars$$

that will deal with the passing of parameters to a newly invoked method.
In particular, $SetVars(c, m, L, \sigma)$ loads the current values of the
list $L$ of variables into the table $VARS$ of variables that are
declared as parameters by the method $m$ of the class $c$. Note that
this function also checks the types of the list $L$ of variables against
the declared list of parameters.

We define the instruction

$$\mathsf{Invoke} : class\_name \times method\_name \times list(var\_name) \to instrn$$

so that $[\![\mathsf{Invoke}(c, m, L)]\!](\sigma)$ invokes the method $m$
of the class $c$ on the values that are stored in the local variables
specified in the list $L$:

$$[\![\mathsf{Invoke}(c, m, L)]\!](\sigma) = \Delta^{PC}(Start_{index}, \Delta^{Frames}(Push_{frame}(\nu^{Frame}(EmptyStack, SetEnv(c, m, \sigma), SetVars(c, m, L, \sigma)), \Pi^{Frames}(\sigma)), \sigma))$$

(where $Start_{index}$ gives the first value of the indexing set used to
locate program instructions).

Returning from methods When we return from a method, we pop the 
completed frame off the top of the stack of frames, taking care to push 
the value from the top of the operand stack of the completed frame onto 
the cleared operand stack of the reinstated frame. Finally, we restore the 
value of the program counter from the execution environment; we define 
this action with an instruction 

Return^ :— >■ instrn 



by: 



|Returiis]((j) 

/\PC ^ YjReturnInstrn 

Loads {F etchs (a), 

/\OpStack^ Empty Stack, 



A 



Frames 



{PopF rame (n 



Frames 



(c^))w)))) 



Void methods do not return any value. We return from a void method 
with the instruction 

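\[ \mathrm{Return}_{void} : \ \to \mathit{instrn} \]

which we define by:

\[ [\![\mathrm{Return}_{void}]\!](\sigma) = \Lambda^{PC}(\Pi^{ReturnInstrn}(\sigma),\ \Lambda^{OpStack}(\mathit{EmptyStack},\ \Lambda^{Frames}(\mathit{PopFrame}(\Pi^{Frames}(\sigma)), \sigma))) \]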
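The invoke and return steps described in this section can be illustrated with a small executable sketch. It is not the algebraic specification itself: the names `Frame`, `State`, `invoke` and `return_value`, the dictionary key `return_pc`, and the choice to resume the caller at the instruction after the call are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

START_INDEX = 0  # assumed first value of the instruction index set


@dataclass
class Frame:
    op_stack: list = field(default_factory=list)  # operand stack of this frame
    env: dict = field(default_factory=dict)       # execution environment
    vars: dict = field(default_factory=dict)      # parameters and locals


@dataclass
class State:
    pc: int = START_INDEX
    frames: list = field(default_factory=list)    # frames[-1] is the current frame


def invoke(state, params, arg_names):
    """Push a new frame whose parameters are loaded from the caller's variables."""
    caller = state.frames[-1]
    new_vars = {p: caller.vars[a] for p, a in zip(params, arg_names)}
    # Assumption: the point of interruption is recorded as the next instruction.
    frame = Frame(op_stack=[], env={"return_pc": state.pc + 1}, vars=new_vars)
    return State(pc=START_INDEX, frames=state.frames + [frame])


def return_value(state):
    """Pop the completed frame and pass its top operand to the reinstated frame."""
    done = state.frames[-1]
    rest = state.frames[:-1]
    caller = rest[-1]
    # The reinstated frame gets a cleared operand stack holding only the result.
    resumed = Frame(op_stack=[done.op_stack[-1]], env=caller.env, vars=caller.vars)
    return State(pc=done.env["return_pc"], frames=rest[:-1] + [resumed])
```

Both functions return a new `State` rather than replacing the frame list of the old one, which loosely mirrors the state-transformer reading of the instructions.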



4.5 Registers 

As we explained in Section 3.2, we only have one register, that of the 
program counter, in our abstract model of the JVM. This register controls 
which instruction we execute next. 

Being a low-level language, the JVM has basic flow-of-control mechanisms
to conditionally or unconditionally alter the value of the program
counter or to execute a subroutine, as well as the high-level flow-of-control
mechanisms of invoking or returning from methods described in
Section 4.4.

Structure We consider that the program counter register only stores
the address of the particular instruction of a given method and class
that we are to execute next. (Recall from Section 4.3 that we store the
information about the method and class in the execution environment to
maintain links with the concrete JVM.)

We structure the registers as a Cartesian product 

REGISTERS = CP (PC) 
containing the single program-counter element. 

Unconditional Flow-of-Control The simplest flow-of-control mecha- 
nism is provided by the unconditional instructions of Goto and Nop, and 
also that of the Jsr instruction which is a more structured version of 
Goto. 

The Nop instruction has no effect on the state other than to increment 
the program counter. This is trivial to model: 

\[ [\![\mathrm{Nop}]\!](\sigma) = \Lambda^{PC+1}(\sigma) \]

The Goto instruction simply specifies the next instruction to be exe- 
cuted by providing a new program counter address relative to the current 
instruction. We define 

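\[ \mathrm{Goto} : \mathit{instrn\_index} \to \mathit{instrn} \]

so that $[\![\mathrm{Goto}(i)]\!](\sigma)$ changes the value of the program counter of the state
$\sigma$ by an offset $i$:

\[ [\![\mathrm{Goto}(i)]\!](\sigma) = \Lambda^{PC}(\mathit{Plus}(\Pi^{PC}(\sigma), i), \sigma) \]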

The jump-to-subroutine instruction

\[ \mathrm{Jsr} : \mathit{instrn\_index} \to \mathit{instrn} \]

adds a given offset to the program counter. So that execution can resume
after the subroutine has completed, the address of the instruction
following the jump-to-subroutine instruction is pushed onto the operand
stack:

\[ [\![\mathrm{Jsr}(i)]\!](\sigma) = \Lambda^{PC}(\mathit{Plus}(\Pi^{PC}(\sigma), i),\ \mathit{Store}_{instrn\_index}(\mathit{Plus}(\Pi^{PC}(\sigma), \mathit{Succ}(\mathit{Zero})), \sigma)) \]
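As a rough illustration of the three unconditional instructions, the sketch below treats a state as a pair of a program counter and an operand stack. The type `State` and the function names are assumptions introduced here; the remaining components of the JVM state are left out.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    pc: int                    # program counter
    op_stack: Tuple[int, ...]  # operand stack, top at the right


def nop(s: State) -> State:
    """Nop: only increment the program counter."""
    return s._replace(pc=s.pc + 1)


def goto(offset: int):
    """Goto(i): move the program counter by the relative offset i."""
    return lambda s: s._replace(pc=s.pc + offset)


def jsr(offset: int):
    """Jsr(i): jump by i and push the address of the following instruction."""
    return lambda s: State(pc=s.pc + offset, op_stack=s.op_stack + (s.pc + 1,))
```

For instance, `jsr(5)` applied to a state with program counter 10 yields program counter 15 with 11 on top of the operand stack, so a later return from the subroutine can resume at 11.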
Conditional Flow-of-Control A more flexible flow-of-control mechanism
is provided by the conditional instruction

\[ \mathrm{Cond}_w : \mathit{fun}_{w,bool} \times \mathit{instrn\_index} \to \mathit{instrn} \]

so that $[\![\mathrm{Cond}_w(f, i)]\!](\sigma)$ alters the program counter by the offset $i$ if the
predicate $f : w \to \mathit{bool}$ applied to the arguments which are stored on the
top of the operand stack evaluates to true, and otherwise the program
counter is simply incremented by one:

\[ \mathit{Apply}_{w,bool}(f, \mathit{Fetch}_w(\sigma)) = \mathit{True} \ \Rightarrow\ [\![\mathrm{Cond}_w(f, i)]\!](\sigma) = \Lambda^{PC}(\mathit{Plus}(\Pi^{PC}(\sigma), i), \mathit{Remove}_w(\sigma)) \]

\[ \mathit{Apply}_{w,bool}(f, \mathit{Fetch}_w(\sigma)) = \mathit{False} \ \Rightarrow\ [\![\mathrm{Cond}_w(f, i)]\!](\sigma) = \Lambda^{PC+1}(\mathit{Remove}_w(\sigma)) \]

We also have an extended conditional instruction 

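\[ \mathrm{Case}_s : s \times \mathit{list}(s \times \mathit{instrn\_index}) \times \mathit{instrn\_index} \to \mathit{instrn} \]

so that $[\![\mathrm{Case}_s(k, L, d)]\!](\sigma)$ searches the list $L$ of pairs of matches and
offsets for the first occurrence of a match equal to the key $k$, and returns
the corresponding offset. If no such match is found, then the default offset
$d$ is used.

In order to maintain the useful notion that each instruction can be
completed within one time step, we introduce a function

\[ \mathit{Case}_s : s \times \mathit{list}(s \times \mathit{instrn\_index}) \times \mathit{instrn\_index} \times \mathit{jvm\_state} \to \mathit{jvm\_state} \]

to describe the semantics of the instruction $\mathrm{Case}_s$. We define $\mathit{Case}_s$ by:

\[ \mathit{Case}_s(k, \mathit{Empty}, d, \sigma) = \Lambda^{PC}(\mathit{Plus}(\Pi^{PC}(\sigma), d), \sigma) \]

\[ \mathit{Equals}_s(k, m) = \mathit{True} \ \Rightarrow\ \mathit{Case}_s(k, \mathit{Lst}((m, i), L), d, \sigma) = \Lambda^{PC}(\mathit{Plus}(\Pi^{PC}(\sigma), i), \sigma) \]

\[ \mathit{Equals}_s(k, m) = \mathit{False} \ \Rightarrow\ \mathit{Case}_s(k, \mathit{Lst}((m, i), L), d, \sigma) = \mathit{Case}_s(k, L, d, \sigma) \]

Then, we can define

\[ [\![\mathrm{Case}_s(k, L, d)]\!](\sigma) = \mathit{Case}_s(k, L, d, \sigma). \]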
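The three defining equations translate directly into a recursive function. In this sketch the state is reduced to the program counter alone, and the name `case_step` is an assumption made for illustration.

```python
def case_step(key, table, default, pc):
    """Return the new program counter after an extended conditional.

    table is a list of (match, offset) pairs; the first pair whose match
    equals key supplies the offset, otherwise default is used.
    """
    if not table:                      # Empty list: use the default offset
        return pc + default
    (match, offset), rest = table[0], table[1:]
    if key == match:                   # Equals(k, m) = True
        return pc + offset
    return case_step(key, rest, default, pc)  # Equals(k, m) = False
```

Because the search is written as one function on the whole list, the instruction still counts as a single step of the machine, which is the point of introducing the auxiliary function.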

4.6 Single Threads 

We can regard the threads area of the JVM as the control centre of the 
machine. It determines the order of the instructions that are executed, 
and it stores the intermediate results of computations that are initiated 
by these instructions. 
Structure For simplicity, we just consider a single-threaded model. The
thread of a state performs calculations on data; we store the results of such 
calculations in the objects on the heap. In particular, a thread consists 
of a program counter register that tells us which instruction we are to 
execute next, and a stack of frames which we use to store values needed 
in the calculation of the execution of methods. 

We model a thread as a Cartesian product of the registers and the 
frames: 

THREAD(A) = CP(REGISTERS, FRAMES(A))

4.7 The Method Area 

The method area of the JVM stores all the information pertaining to 
the classes of the Java bytecode programs. We record all the fields that 
are declared by the methods of a class, together with the actual program 
instructions. 

Structure In the method area we store the instructions that we execute. 
We structure the method area as a table of classes indexed by the class 
names: 

METHOD_AREA(A) = TABLE(CLASS_NAME, CLASS(A))

Within each class entry we store

(i) the constant pool, where we store the constants of the class;

(ii) the declarations of the fields that the class uses; and

(iii) the methods of the class, which include the actual program instructions.

Thus,

CLASS(A) = FCP(CONSTANT_POOL(A), FIELD_DECLNS(S), METHODS(Σ))

where the filter we need to apply is that of checking that all the constants
and fields used in a method have been declared in the class in which it
resides (an example of a completeness filter), and that these two entities
are disjoint from each other.

Field Declarations We declare all the fields of a class before we use 
them in the instructions; we store the values of the fields on the heap. 
There are two types of field: statics (per-class fields) and non-statics (per- 
object fields). 
Structure We first split the fields into the statics and non-statics:

FIELD_DECLNS(S) = FCP(STATIC_DECLNS(S), NON_STATIC_DECLNS(S))

where the filter we apply ensures that the static and non-static field
declarations are disjoint from each other. Within each of these components,
we represent the fields as a list of the field names:

STATIC_DECLNS(S) = FLIST(FIELD_NAME(S))

NON_STATIC_DECLNS(S) = FLIST(FIELD_NAME(S))

where FIELD_NAME(S) specifies a set of field names which we type with
the sort set S of the underlying data type A, and in both cases we ensure
that the lists contain no repetitions.
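The two filters on field declarations (no repetitions within a list, and disjointness between the two lists) amount to a simple predicate. The following sketch uses plain Python lists of names; the function name `valid_field_declns` is hypothetical.

```python
def valid_field_declns(statics, non_statics):
    """Check the filters on a pair of field-name lists."""
    # Each list must contain no repeated names.
    no_repeats = (len(set(statics)) == len(statics)
                  and len(set(non_statics)) == len(non_statics))
    # A name may not be declared both as static and as non-static.
    disjoint = set(statics).isdisjoint(non_statics)
    return no_repeats and disjoint
```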

The Constant Pool We store the constants of the classes within the 
constant pool. (As our model is so abstract, this is all we need store in 
our constant pool.) 

Structure We structure the constant pool as a table indexed by the con- 
stant names, and storing their values: 

CONSTANT_POOL(A) = TABLE(CONST_NAME(S), A)

where CONST_NAME(S) specifies a list of constant names which we
type with the sort set S.

Loading from the Constant Pool To retrieve values from the constant 
pool, we define a function 

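\[ \mathit{Fetch}^{CP}_s : \mathit{const\_name}_s \times \mathit{jvm\_state} \to s_{uninitialised} \]

so that $\mathit{Fetch}^{CP}_s(c, \sigma)$ returns the constant $c$ that is stored in the
constant pool of the state $\sigma$.

This allows us to define the instruction

\[ \mathrm{ConPlLoad}_s : \mathit{const\_name}_s \to \mathit{instrn} \]

so that $[\![\mathrm{ConPlLoad}_s(c)]\!](\sigma)$ places a copy of the value of the constant $c$
from the constant pool on the top of the operand stack:

\[ [\![\mathrm{ConPlLoad}_s(c)]\!](\sigma) = \Lambda^{PC+1}(\mathit{Load}_s(\mathit{Fetch}^{CP}_s(c, \sigma), \sigma)) \]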
The Methods We store the program instructions of the JVM within
the methods section of the classes.

Structure We structure the program code by initially storing the instructions
for each method in a table

INSTRNS(Σ) = FTABLE(INSTRN_INDEX, INSTRN(Σ))

indexed by INSTRN_INDEX; this indexing scheme provides an isomorphic
copy of the natural numbers with the constant StartIndex and functions
Succ and Plus.

The filter that we need to apply here is that any conditional or un- 
conditional branching instructions refer to an instruction within the ta- 
ble, i.e., branching instructions can only direct execution to instructions 
within the same method. This type of filter is an example of a check for 
completeness. 
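The completeness filter on branching instructions can be pictured as a check over the instruction table. The encoding below (a list of `(opcode, offset)` pairs indexed from 0, with `None` for instructions that do not branch) is an assumption chosen only to keep the sketch short.

```python
def branches_stay_in_method(instrns):
    """Return True if every relative branch targets an index inside the table."""
    for index, (_opcode, offset) in enumerate(instrns):
        if offset is None:
            continue                   # not a branching instruction
        target = index + offset
        if not 0 <= target < len(instrns):
            return False               # branch leaves the method
    return True
```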

With each method we associate its variables (local and parameter): 

METHOD(Σ) = FCP(PARAMETERS(S), LOCALS(S), INSTRNS(Σ))

where

PARAMETERS(S) = FLIST(VAR_NAME(S))

specifies the parameters and

LOCALS(S) = FLIST(VAR_NAME(S))

the local variables of methods as lists of variable names. Note that we
apply filters to check that the parameters and locals are disjoint from
each other and that neither list contains repetitions.

Then we store the individual methods in a table indexed by the
method names:

METHODS(Σ) = TABLE(METHOD_NAME(S ∪ {void}), METHOD(Σ))

where METHOD_NAME(S ∪ {void}) specifies a set of method names
which we type to indicate the return type of the method using the sort set S
of the signature Σ, or if the method does not return a value, we indicate
this with the sort void.
Operations The only operation on the bytecodes is that of accessing the
individual instructions. We define a function

\[ \mathit{Fetch}^{Instrn} : \mathit{jvm\_state} \to \mathit{instrn} \]

so that $\mathit{Fetch}^{Instrn}(\sigma)$ returns the instruction of the JVM that we are to
execute next on the state $\sigma$:

\[ \mathit{Fetch}^{Instrn}(\sigma) = \mathit{Read}_{instrn}(\Pi^{PC}(\sigma),\ \mathit{Read}_{method}(\Pi^{Method}(\sigma),\ \mathit{Read}_{class}(\Pi^{Class}(\sigma), \Pi^{MethodArea}(\sigma)))) \]



4.8 The Heap 

Objects are created dynamically as instances of classes and are stored on 
the heap. 

We can load and store values in fields of objects. We can also create 
objects, but in our high-level view of the JVM we shall not consider any 
aspect of memory reclamation. 

Structure We store dynamic structures in the form of objects on the 
heap. At this level of abstraction, we regard the heap as a storage area 
for two types of information: we separate out the per-object information 
from the per-class information: 

HEAP(A) = CP(STATICS(A), OBJECTS(A))

Each class is associated with one set of static fields; we store these
static fields as a table indexed by the class names, and storing the fields:

STATICS(A) = TABLE(CLASS_NAME, FIELDS(A))

In turn, we store the fields as a table indexed by the field names and
storing their values:

FIELDS(A) = TABLE(FIELD_NAME(S), A)

Each class may have an arbitrary number of objects associated with it.
Hence, we store the objects in a table containing the non-static fields and
indexed by some scheme (specified by OBJECT_INDEX) that uniquely
identifies each object:

OBJECTS(A) = TABLE(OBJECT_INDEX, NON_STATICS(A))
In addition, we require that we can generate a fresh reference upon request
with a function

\[ \mathit{NewRef} : \mathit{jvm\_state} \to \mathit{object\_index} \]



whereby 

Rc(ldT-iQ^_sfcHlQs{^N CW Re f (o^ TI ^ (^)) — 

As with the static fields, we store the non-static fields as a table 
indexed by the field names and storing their values: 

NON_STATICS(A) = FIELDS(A) = TABLE(FIELD_NAME(S), A).

Creating Static Instances When we create an object, it is as the 
instance of some class, each of which has a unique name. 

Recall from Section 4.7 that we declare all the types of the fields of
a class within the method area, and that we split the declarations into
the static and non-static fields. When we create an instance of the class
statics, we initialise all the fields to their default values with the function

\[ \mathit{Default} : \mathit{list}(\mathit{field\_name}) \to \mathit{fields} \]

which is an extension of the function $\mathit{DefaultValue}_s$ discussed in Section 3.2.

We store the static instance of a class with a function

\[ \mathit{StoreStatics} : \mathit{jvm\_state} \to \mathit{jvm\_state} \]

so that $\mathit{StoreStatics}(\sigma)$ places an instance of the static variables of the
current class on the heap, if it has not already been stored:

Equals{Read fields (cr) , ^ ^ 

Uninitialised fields) = False 
StoreStaticsia) = 



Equal s {Read fields (cr ) , 77 'S'iai*cs (^ ) ) , 

Uninitialised fields) = True 
StoreStatics{a) = a 
Creating Objects As with the static fields, we initialise all the non-static
fields of a newly created object to their default values with the
function $\mathit{Default}$. We define a function

\[ \mathit{StoreNonStatics} : \mathit{jvm\_state} \to \mathit{jvm\_state} \]

so that $\mathit{StoreNonStatics}(\sigma)$ places an instance of the non-static variables
of the current class on the heap:

StoreN onStatics{a) 

= {Storenon_statics{NewRef {a), 

Now we can define the instruction 

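\[ \mathrm{New} : \ \to \mathit{instrn} \]

so that $[\![\mathrm{New}]\!](\sigma)$ places an object of the current class on the heap and
stores the index to the object on the top of the operand stack:

\[ [\![\mathrm{New}]\!](\sigma) = \Lambda^{PC+1}(\mathit{Load}_{object\_index}(\mathit{NewRef}(\sigma),\ \mathit{StoreStatics}(\mathit{StoreNonStatics}(\sigma)))) \]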
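A small sketch may help to picture what object creation does to the heap. It assumes a heap given as two dictionaries (`statics` indexed by class name, `objects` indexed by reference), integer references allocated in order, and a single default value for every field; none of these encodings come from the specification, and the name `new_object` is hypothetical.

```python
def new_object(heap, class_name, declns, op_stack, default=0):
    """Create an object of class_name and push its reference.

    heap:   {"statics": {class: fields}, "objects": {ref: fields}}
    declns: {"static": [names], "non_static": [names]}
    """
    # Fresh reference: assumed to be the next unused integer.
    ref = len(heap["objects"])
    objects = {**heap["objects"],
               ref: {f: default for f in declns["non_static"]}}
    statics = dict(heap["statics"])
    if class_name not in statics:      # statics are stored once per class
        statics[class_name] = {f: default for f in declns["static"]}
    return {"statics": statics, "objects": objects}, op_stack + [ref]
```

Creating a second object of the same class leaves the already stored static fields untouched and only adds a new entry to the table of objects.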

Loading from Objects We have two instructions to load field values: 
one for static fields, and the other for non-static fields. 

We define the instruction

\[ \mathrm{GetField}_s : \mathit{field\_name}_s \to \mathit{instrn} \]

so that $[\![\mathrm{GetField}_s(f)]\!](\sigma)$ loads the value of the field $f$ of the object
that is on the heap at the location determined by the top element of the
operand stack:

\[ [\![\mathrm{GetField}_s(f)]\!](\sigma) = \Lambda^{PC+1}(\mathit{Load}_s(\mathit{Read}_s(f, \mathit{Read}_{non\_statics}(\mathit{Fetch}_{object\_index}(\sigma), \Pi^{Objects}(\sigma))),\ \mathit{Remove}_{object\_index}(\sigma))) \]

The process for loading static fields, i.e., fields of class instances, is
very similar: we define the instruction

\[ \mathrm{GetStatic}_s : \mathit{class\_name} \times \mathit{field\_name}_s \to \mathit{instrn} \]

so that $[\![\mathrm{GetStatic}_s(c, f)]\!](\sigma)$ places a copy of the static field $f$ of the
class $c$ on the top of the operand stack by:

\[ [\![\mathrm{GetStatic}_s(c, f)]\!](\sigma) = \Lambda^{PC+1}(\mathit{Load}_s(\mathit{Read}_s(f, \mathit{Read}_{fields}(c, \Pi^{Statics}(\sigma))), \sigma)) \]
Storing in Objects Similarly, we have two instructions to store field
values, depending on whether they are static or non-static.

To store non-static field values, we first introduce a function 

: s x fieldjiamcs x object Jndex x jvm.state — >■ jvmstate 

so that Storef^'^^'^{v, /, o, a) stores the value v of the field / of the object 
at position o on the heap: 

= (Stores (r>, /, Readnon_statics{o, {a))),a) 



We define the instruction

\[ \mathrm{PutField}_s : \mathit{field\_name}_s \to \mathit{instrn} \]

so that $[\![\mathrm{PutField}_s(f)]\!](\sigma)$ transfers the second item of the operand stack
into the field $f$ of an object which is determined by the top item of the
operand stack:

[PutFields(/)](cr) 

= {Storef^'^^^iFetchsiRemoveobject.indexicr)), 

/, FetchobJect_^ndex{(T), (a))) 

The act of storing a static field with the instruction 

\[ \mathrm{PutStatic}_s : \mathit{class\_name} \times \mathit{field\_name}_s \to \mathit{instrn} \]

follows similarly:

[PutStatic^(c, /)](cr) 

= /, Readf,Ms{c, 77^*“**“ (a))), 

Removes{a))) 

5 Modelling the Concrete JVM 

Having specified the behaviour of our abstract JVM, we can now turn our 
attention to what would be required to model the concrete JVM. This 
process is a mixture of reducing the level of abstraction that we intro- 
duced, together with addressing the simplifications made to our model. 




272 K. Stephenson 



5.1 The Underlying Data Structures 

In our abstract model of the JVM, we just considered that it computed 
over some arbitrary abstract data type A. To model the concrete JVM 
(CJVM), we have to instantiate A with an appropriate data type. 

As will be the case in dealing with the concretisation of other features 
of the abstract JVM (AJVM), we shall find it helpful to add the details 
required in a step-wise manner. Thus, we consider the underlying data 
structures, then the underlying data type as a computational entity, and 
finally the representation of the data. 

The Specification Structures The specification structures of Sec- 
tion 2.3 also provide a suitable framework for the CJVM. We simply 
need to instantiate the underlying data type with appropriate concrete 
abstract data types (as indicated below) in most cases. 

To model implementations of the CJVM, though (i.e., to remove yet
another layer of abstraction), more radical work (although essentially just
exercises in data structures) is required; for example, the operand stack
of the JVM is typically implemented as an array with a register recording
the current top of the stack.

The Underlying Data Type As can be seen from Figure 4, we take the
data type A over which the AJVM computes and instantiate it so that it
is constructed from the primitive types of BYTE, INT, SHORT, LONG,
CHAR, RETURN_ADDRESS (which we termed INSTRN_INDEX in
the AJVM), and REFERENCE (which we termed OBJECT_INDEX in
the AJVM).

We can then model each of these components at decreasing levels of 
abstraction. 

Data Representation In order for Java to be portable, the CJVM 
specifies how the data types are represented: 

— byte, short, int and long are signed two’s complement integers (of 
sizes 8 bit, 16 bit, 32 bit and 64 bit, respectively); 

— char is an unsigned integer (of size 16 bit);

— float and double are IEEE 754 floating point numbers (of sizes 32 
bit and 64 bit, respectively); and 

— returnaddress and reference are stored using 16 bits. 

To maintain the benefits that our abstract models have introduced, it 
would be beneficial to model the JVM’s computation: 
[Figure 4: the underlying data type, built from BYTE, SHORT, INT, LONG, CHAR, FLOAT and DOUBLE.]

Fig. 4. Structure of the underlying data type.

(i) first using a completely abstract notion of word;

(ii) then using the JVM's abstract notion of word, where one word is sufficiently
large to store values of byte, short, int, float, reference
and returnaddress, and two words are sufficiently large to store long
and double; and

(iii) finally at bit-level.
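The integer representations and the word-level view listed above can be made concrete with a few helper functions. This is only a sketch at the level of Python integers; the names `WIDTHS`, `to_signed`, `to_char` and `words_needed` are assumptions, and floating point values are not modelled.

```python
WIDTHS = {"byte": 8, "short": 16, "int": 32, "long": 64}  # sizes in bits


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def to_char(value):
    """Interpret the low 16 bits of value as an unsigned char."""
    return value & 0xFFFF


def words_needed(type_name):
    """Number of abstract JVM words occupied by a value of the given type."""
    return 2 if type_name in ("long", "double") else 1
```

For example, `to_signed(128, WIDTHS["byte"])` gives -128, which is the wrap-around behaviour a bit-level model of byte arithmetic has to reproduce.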

5.2 Dealing with the Remaining Simplifications 

Furnishing our model with concrete structures will not yield a full speci- 
fication of the JVM; we still have to deal with the simplifications that we 
introduced into our model. These simplifications fall into two categories: 
those that we have omitted so that we do not obscure the essential struc- 
ture of the JVM, and those which are problematical to deal with. 

In the first category fall instructions which are provided for: (i) implementation
efficiency reasons, (ii) exception handling and (iii) type checking.

We can only consider efficient versions of instructions at a concrete 
level where the local variables are indexed by (relative) addresses, rather 
than the abstract notion of names used in the AJVM. Then it is a sim- 
ple matter to extend the instruction set to include specifications of the 
semantics of load and store instructions which have an implicit index. 

We omitted details regarding the throwing and catching of exceptions 
in the AJVM as this is essentially an extension of method invocation. 

We also neglected to deal with issues in any way associated with type
checking, which we have deliberately underplayed in this document. For
example: we have not made any distinction between classes with regard
to interfaces or access restrictions; we have not considered the checkcast
instructions; and we have not dealt with method resolution.

The major area which we have not dealt with is that of multithreading, 
which raises the open problem of: “How can we lift the general algebraic 
approach of Section 2 to model a machine with multithreading?” 

6 Conclusions 

In this paper, we have analysed a hierarchical structure of computer systems
using algebraic methods. This algebraic modelling has been much
studied at Swansea: at the microprocessor level (Harman and Tucker
[1997] and Fox and Harman [1998]), through abstract models of computation
to high-level languages (Stephenson [1996]), and in a study of hierarchical
discrete-space, discrete-time systems (Poole et al. [1998]). This allows
an analysis of the correctness of implementation to be considered within
a unified framework with the aim of supporting trusted compilation.

The Java programming language is an ideal vehicle for our programme 
of constructing a unified formal path from high-level languages to hard- 
ware, as it employs an abstraction mechanism in the form of the Java 
Virtual Machine for its implementation. Thus, the gap between Java and 
the JVM is smaller (and therefore more tractable) than between Java and 
a more physically detailed hardware model. 

Work has already been performed in proving the correctness of a
smaller skeletal compiler from a simple while language to an idealised
machine model (Unlimited Register Machine), and this has been shown to be
feasible (the proof has been performed by hand; see Stephenson [1996]). In
addition, this intermediate stepping stone of the JVM will allow more
modular proofs to be constructed in the analysis of any particular implementation.

In this paper, we have concentrated on considering the specification 
of a case study of the Java Virtual Machine. We have shown that our 
specification techniques are capable of handling such a large example in a 
unified, comprehensive and practical fashion. The feasibility of specifying 
the JVM is, we feel, only made possible by exploiting all the possible 
abstractions we can make. 
Abstract. A grid protocol models concurrent computation, and consists
of one or more modules repeatedly performing parallel I/O and compu- 
tation. We provide several concise specification formats and correctness 
results on (external) I/O behaviour, and illustrate our approach by ex- 
amples. 

Note: Some of the results described in this paper were published earlier in 
[BHP97]; other results were established by students in the ’96/’97 course 
Process Algebra II delivered at the University of Amsterdam [BJM97], 
and in a master’s thesis [Pou97]. 



1 Introduction 

This paper surveys work done on specification and analysis of “grid protocols”.
It is based on [BHP97], in which a simple class of these protocols is introduced,
and on the papers [BJM97,Pou97], both of which deal with extensions.

A grid protocol models concurrent computation in a grid-like architecture.
This type of protocol is based on Synchronous Concurrent Algorithms (SCAs)
as developed by Tucker et al. [TT94]. Our motivation to follow this approach
can be illustrated by the following citation (op. cit.):

“many specialised models of computation possess the essential features of 
SCAs, including systolic arrays, neural networks, cellular automata and cou- 
pled map lattices. The parallel algorithms, architectures and dynamical sys- 
tems that comprise the class SCAs have many applications, ranging from their 
use in special purpose devices [...] to computational models of biological and 
physical phenomena.” 

* Partially sponsored by Esprit Working Group 8533 NADA — New Hardware Design 
Methods. 



B. Möller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 278-308, 1998.
© Springer-Verlag Berlin Heidelberg 1998
A grid protocol consists of modules, i.e., data processing units that can cooperate
with each other or the environment by passing values (terminology taken from
op. cit.). This cooperation can be modeled in various ways. In this paper we
consider value-passing by synchronization (communication actions). A module
has fixed incoming and outgoing channels or ports, modeling the connection with
either one of the network’s modules, or with some external device. Furthermore,
it has an associated function. The current value of a module is either initial or
computed from its function on the values received via its input ports. In terms of
behaviour, a module repeatedly performs parallel execution of input and output
actions, each of which operates on a distinct channel. Having executed all its I/O
actions (Input/Output), the module updates its current value by application of
its function to the newly received value(s). As an example, consider the module
M in Fig. 1.



[Figure 1: on the left, module M(8) with the values 3 and 2 waiting on its two input channels and the value 8 available on its two output channels; on the right, module M(5) with new values v1 and v2 on its input channels and the value 5 on its output channels.]

Fig. 1. Example of the operation of a single module.



In this figure, module M has current value 8, notation M(8) (displayed at
the left-hand side). This value can be sent along the two output channels, while
values 3 and 2 are ready to be received along the two input channels. The function
of this example module is to add the values received, thus the next current value
will be 5 (available at the output channels), and new values $v_1$ and $v_2$ can be
received. After this parallel I/O behaviour the module has evolved into M(5). A
straightforward, recursive specification of M’s behaviour is

\[ M(d) = (\parallel \text{I/O value-passing actions}) \cdot M(e + f) \]

in case $d$ is the current value, and values $e$ and $f$ are received as new input values.
Throughout the paper, we stick to the convention that modules are depicted by
rectangular blocks, with input channels coming in at the top and output channels
leaving from the bottom.
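The behaviour of the example module can be mimicked by a few lines of code. The sketch below collapses each round of parallel I/O into one step and ignores the interleaving of the individual actions, so it shows only the external effect of the recursion; the names `module_step` and `run` are assumptions made here.

```python
def module_step(current, inputs, function=sum):
    """One round of parallel I/O followed by the value update.

    Returns the value offered on the output channels during this round
    and the module's next current value.
    """
    offered = current                # the value sent on every output channel
    next_value = function(inputs)    # update from the newly received values
    return offered, next_value


def run(initial, input_rounds):
    """Run the module over a finite list of input rounds."""
    current, sent = initial, []
    for inputs in input_rounds:
        offered, current = module_step(current, inputs)
        sent.append(offered)
    return sent, current
```

Starting from M(8) and receiving 3 and 2, `module_step(8, (3, 2))` offers 8 and moves on to the value 5, matching the figure.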
A grid protocol can be a network of processors or (groups of) points of measure
in some physical phenomenon, for example a hardware device or a vibrating
string. In this paper we focus on modeling and characterization of grid protocols
in process algebra. We offer some specification formats, and provide for each of
these a general result on the external behaviour of the network (characterization).
We come up with two characterization results on the external behaviour
of grid protocols. These imply both correctness and freedom from deadlock:

1. In the case that all modules and internal channels of the network form a
connected graph and the external behaviour is located at one single module,
we obtain a simple characterization result: the order of the (internal)
synchronizations is not relevant and the network’s external behaviour
(stream transformation or generation) is determined by a prefix of I/O actions,
followed by a simultaneous value update. This result is established in
[BHP97].

2. In the case that all modules are synchronized by a device that keeps the
modules in pace, the operation of the network is characterized by a prefix of
(external) I/O actions, followed by a simultaneous value update. In this case,
also the external activity of the network is transformation (or generation)
of parallel streams, irrespective of location of I/O and connectedness. This
result is established in [BJM97,Pou97].

We do not discuss algebraic details or proofs. For these we refer to the above
mentioned references.

We further motivate our approach as one that yields an operational perspective
on the module level, i.e., value-passing by arbitrary interleaved synchronizations,
and that relates this perspective with a correctness characterization
of a network’s external I/O behaviour. Our approach is based on a combination
of value-passing calculus CCS (Calculus of Communicating Systems, Milner
[Mil89]), and process algebra ACP (Algebra of Communicating Processes,
Bergstra and Klop [BK84,BK85,BW90]). Main ingredients are early read actions,
in which a variable can get instantiated via communication (value-passing), and
the process prefix, a generalization of Milner’s action prefix, introduced in [BB94].
With these ingredients, a concise notation of parallel input is possible.

Structure of the paper. In the next section we introduce value-passing, process
prefixes and early read actions. In Section 3, we give a specification format
for (finitary) connected networks with I/O located at one port, and discuss
a characterization result. In the next section (4) we introduce a second class
of grid protocols: Beating Grid Protocols. These networks are controlled by a
Beat-process, i.e., a synchronization device that keeps parallel I/O of the whole
network in phase. For this class we consider two types of Beat-processes, and
state a general characterization result. In Section 5 we consider as an example an
approximation of solutions of the one-dimensional wave equation, which can be
modeled either as a connected grid protocol, or a beating grid protocol. Finally,
in Section 6 we give some conclusions. An appendix gives a brief introduction to
$\mathrm{ACP}^{\tau}(A, \gamma)$, the process algebraic approach underlying this paper.
2 Data and some Process Algebra

In this section we explain value-passing, the basic communication mechanism
in grid protocols. Furthermore, we introduce early reads and process prefixing,
and show that these form a concise notation for the mechanics of parallel value-passing.
On the fly we recall some process algebra.



2.1 Actions, Value-Passing, and Generalized Merge 

Actions are the most basic processes we deal with. Furthermore, we consider
handshaking communication between actions: the simultaneous occurrence of
two actions fuses together to a new action. Such an action is often called a
communication action, and we assume both the set of actions, and the communications
as parameters of our theory. If two actions $a$ and $b$ communicate to
action $c$, we can use the communication merge $|$ and write

\[ a \mid b = c, \]

and in case we are not particularly interested in $c$, we also say that $a$ and $b$
synchronize. If $a$ and $b$ do not communicate, we write

\[ a \mid b = \delta, \]

where $\delta$ is a symbol expressing deadlock or inaction. The communication merge
is a commutative and associative operation on processes.

We adopt a simple specification paradigm for processes parameterized with
data, which originates from μCRL [GP95]. As grid protocols process data, we demand
computability and decidability of all data involved (in the sense of [BT95]).
Data parameterization is used in actions, sums, and communications.

In order to specify value-passing, let $i, j$ be channel identifiers. Action $r_i(t)$
models the act of receiving the particular value $t$ along channel $i$. Action $s_j(t)$
models the sending of data value $t$ along port $j$. Here $t$ may also be a product
of data values. Let for instance $r_i$ and $s_j$ be typed as actions that can
carry values of type $\mathbb{N}$ (the natural numbers), and of type $\mathbb{N}^2$, respectively. So
$r_i(0), r_i(1), \ldots, s_j(0, 0), s_j(0, 1), \ldots$ are considered actions.

A first example of a data-parametric sum is the expression

\[ \sum_{v:\mathbb{N}} (r_i(v)), \]

denoting a process that for an arbitrary value $n$ of $\mathbb{N}$ can once perform action
$r_i(n)$, after which it terminates. Note that the type of the variable $v$ is declared in
the scope of the $\sum$-operation. This expression represents the infinite summation

\[ r_i(0) + r_i(1) + r_i(2) + \cdots \]

where the commutative and associative operation $+$ is called alternative composition
or choice, and defines the execution of one of its operands. However,

\[ x + \delta = x. \]

(So in context of alternative composition, $\delta$ behaves as inaction.) Alternative
composition binds weakest of all binary operations.

A typical form of data-parametric sums concerns the combination with sequential
composition: $x \cdot y$, or simply $xy$, represents process $x$ followed by $y$. For
example,

\[ \sum_{v:\mathbb{N}} (r_i(v) \cdot s_j(2v, v + 1)) \]

represents the infinite summation

\[ r_i(0) \cdot s_j(0, 1) + r_i(1) \cdot s_j(2, 2) + r_i(2) \cdot s_j(4, 3) + \cdots . \]

The operation $\cdot$ is associative, and defines the sequential execution of its operands.
However,

\[ \delta \cdot x = \delta. \]

(So, $a \cdot \delta$ deadlocks after execution of $a$.) Sequential composition binds strongest
of all binary operations. Furthermore, note that

\[ x(y + z) \neq xy + xz. \]

(For example, in case $x = y = a$ and $z = \delta$, the leftmost process equals $aa$,
whereas the rightmost process can deadlock with $a\delta$.) On the other hand, we do
have

\[ (x + y)z = xz + yz. \]

A form of repeated sequential composition, employed in the specification of
grid protocols, is no-exit iteration, introduced in [Fok97] and defined by

\[ (x)^{\omega} = x \cdot (x)^{\omega}. \]

For data-parametric sums, axioms and a proof rule are defined in [GP91c,GP94b].
In particular, these comprise $\alpha$-conversion and axioms to change its scope. We
adopt the convention that $\sum(\_)$ binds strongest of all operations, for example

\[ \Big(\sum_{n:\mathbb{N}} r_i(n) \cdot s_j(2n, n+1)\Big)^{\omega} = \Big(\sum_{v:\mathbb{N}} (r_i(v) \cdot s_j(2v, v+1))\Big)^{\omega}. \]

As for data-parametric communications, we assume that the communications
in Table 1, defining send-read communication, are the only ones defined.

Table 1. Send-read communication for value-passing, $a, b \in A$.

\[ a \mid b = \begin{cases} c_i(t) & \text{if } \{a, b\} = \{r_i(t), s_i(t)\}, \\ \delta & \text{otherwise.} \end{cases} \]

Data-parametric sums distribute over the communication merge $\mid$, provided
no new bindings arise. Also $+$ distributes over $\mid$. For example,

\[ r_i(2) \mid (s_i(2) + s_j(2,2)) = r_i(2) \mid s_i(2) + r_i(2) \mid s_j(2,2) = c_i(2) + \delta = c_i(2), \]
\[ (\ldots + r_i(2)) \mid s_i(2) = c_i(2), \]
\[ \sum_{v:\mathbb{N}} (r_i(v)) \mid s_i(2) = \sum_{v:\mathbb{N}} (r_i(v) \mid s_i(2)) = c_i(2). \]



With the help of send-read communication and encapsulation one can easily
model value-passing (cf. [Mil89]). Here encapsulation is defined by an operation
$\partial_H$ that renames all actions in $H$ into $\delta$,

\[ \partial_H(a) = \begin{cases} \delta & \text{if } a \in H, \\ a & \text{otherwise.} \end{cases} \]

Furthermore, encapsulation distributes over $\cdot$, $+$, and $\sum(\_)$.

Encapsulation does not distribute over merge operations, and can be used
to enforce communication between concurrent processes, such as value-passing
communications. Rather than considering processes $P \mid Q$ of which the first
action must be a communication between $P$ and $Q$, concurrent execution of $P$
and $Q$ is specified as

\[ P \parallel Q, \]

where the merge operator $\parallel$ is defined by ACP-axiom (CM1)

\[ x \parallel y = (x \mathbin{\lfloor\!\lfloor} y + y \mathbin{\lfloor\!\lfloor} x) + x \mid y. \]

Here $\lfloor\!\lfloor$, left merge, is an auxiliary operation that requires that the first
action performed stems from the left operand. It is axiomatized by

(CM2) $a \mathbin{\lfloor\!\lfloor} x = a \cdot x$
(CM3) $ax \mathbin{\lfloor\!\lfloor} y = a(x \parallel y)$
(CM4) $(x + y) \mathbin{\lfloor\!\lfloor} z = x \mathbin{\lfloor\!\lfloor} z + y \mathbin{\lfloor\!\lfloor} z$

where $a$ is an action, $\delta$, or $\tau$ (a constant explained below). So in $P \parallel Q$, the
first action comes either from $P$, from $Q$, or is a communication between $P$ and $Q$.

Now, for a small, typical example on value-passing involving encapsulation,
consider

\[ R = \Big(\sum_{v:\mathbb{N}} (r_i(v) \cdot s_j(2v, v+1))\Big)^{\omega}, \]

a process that is willing to receive any natural $k$ along channel $i$, then sends the
pair $(2k, k+1)$ along channel $j$, and then resumes this behaviour. Assume that
$R$ is executed in parallel with

\[ s_i(5) \cdot S, \]

a process that initially sends the value 5 along channel $i$. The value-passing of 5
from $s_i(5) \cdot S$ to $R$ along channel $i$ can be represented by

\[ \partial_{\{r_i,s_i\}}(R \parallel s_i(5) \cdot S) \]

where we adopt the notation $\partial_{\{r_i,s_i\}}$, only mentioning the identifiers $r_i, s_i$, from
$\mu$CRL. Hence, single $r_i(n)$ and $s_i(n)$ actions cannot occur and are thus enforced
to communicate. We derive

\[ \partial_{\{r_i,s_i\}}(R \parallel s_i(5) \cdot S) = \partial_{\{r_i,s_i\}}\Big(\sum_{v:\mathbb{N}} (r_i(v) \cdot s_j(2v, v+1)) \cdot R \parallel s_i(5) \cdot S\Big) \]
\[ = c_i(5) \cdot \partial_{\{r_i,s_i\}}(s_j(10,6) \cdot R \parallel S), \]

where the second identity follows from the axioms of ACP$^{\tau}(A, \gamma)$ and those
for data-parametric sums. So, the value-passing in this example is modeled by
the execution of the communication action $c_i(5)$ and the resulting process is
$\partial_{\{r_i,s_i\}}(s_j(10,6) \cdot R \parallel S)$. In the setting of $\mu$CRL, a detailed treatment of this
value-passing format can be found in [GK95].

The distinction between internal (unobservable) and external (observable)
behaviour is modeled with the hiding operation $\tau_I$, where $I$ is a set of (internal)
actions. This operation renames the actions in $I$ into $\tau$, the silent action:

\[ \tau_I(a) = \begin{cases} \tau & \text{if } a \in I, \\ a & \text{otherwise,} \end{cases} \]

and distributes over $\cdot$, $+$, and $\sum(\_)$. For the constant $\tau$, an important axiom
is $x \cdot \tau = x$.

Grid protocols are specified as the concurrent composition of a finite number
of modules, and for readability it makes sense to use the generalized merge

\[ \big\Vert_{i \in I} P_i \]

which abbreviates the expression $(P_{i_1} \parallel P_{i_2} \parallel \ldots \parallel P_{i_n})$ for $I = \{i_1, i_2, \ldots, i_n\}$ a
non-empty, finite set of indices. This notation can be justified by commutativity
and associativity of the $\parallel$ operation. If $I$ is a singleton, say $I = \{i_1\}$,

\[ \big\Vert_{i \in \{i_1\}} P_i = P_{i_1}. \]

We often write

\[ \big\Vert_{i=1}^{n} P_i \]

rather than

\[ \big\Vert_{i \in \{1, \ldots, n\}} P_i. \]

The following result follows easily by induction on $n$, exploiting $\tau x \mathbin{\lfloor\!\lfloor} y =
\tau(x \parallel y)$ and $x\tau = x$, and is typical for the process algebraic reasoning that
underlies our characterization results.



Lemma 2.1.1.

\[ \tau \cdot \big\Vert_{i=1}^{n} \tau y_i = \tau \cdot \big\Vert_{i=1}^{n} y_i \]

2.2 Parallel Input: Early Reads and Process Prefix



Let $D$ be some data type. We introduce the binary operation process prefix and
early read actions as a means to provide concise notation for parallel input. Let
$i$ be a channel or port identifier, and $v$ a variable of type $D$. Then

\[ er_i(v) ; x = \sum_{v:D} (r_i(v) \cdot x) \]

is the axiom scheme that introduces the early read action $er_i(v)$ and the
operation $;$, the process prefix. This identity is a $\mu$CRL-like interpretation of the early
read axiom in [BB94]. It is meant that $v$ may occur in process $x$, e.g.,

\[ er_i(v) ; s_j(v) = \sum_{v:D} (r_i(v) \cdot s_j(v)) \]

is an expression without free data-variables, and so is

\[ er_i(v) ; s_j(t) = \sum_{v:D} (r_i(v) \cdot s_j(t)) \]

for $t$ a closed term of type $D$.



Remark 2.2.1. The axiom scheme above reflects Milner's translation of the
value-passing CCS term $a(x).E$ into the basic CCS term

\[ \sum_{v \in V} a_v . \widehat{E\{v/x\}} \]

where $V$ is the value set and $\widehat{\ \cdot\ }$ the translation function [Mil89].

Let $A_{er}$ be the extension of $A$ (the set of atomic actions) with early read
actions $er_k$ for any action $r_k : D_1 \times \ldots \times D_n$ declared over $A$. Axioms for process
prefixing are given in Table 2. The axiom PP4 is considered to be parameterized
with the type of the action. Note that for the $er$ actions we use globally typed
variables.



Table 2. Early input and process prefixing, $a \in A$.

(PP1) $\delta ; x = \delta$              (PP4) $er_k(v) ; x = \sum_{v:D} (r_k(v) \cdot x)$
(PP2) $\tau ; x = \tau \cdot x$          (PP5) $(x + y) ; z = x ; z + y ; z$
(PP3) $a ; x = a \cdot x$                (PP6) $(x \cdot y) ; z = x ; (y ; z)$

By the send-read communication paradigm (see Table 1) we have that $er_i(v) \mid
a = \delta$ for all $a \in A_{er}$. This is used in the following example, in which parallel
input is unraveled ($v, w$ variables of some type $D$):

\[ (er_i(v) \parallel er_j(w)) ; s_i(F(v,w)) \]
\[ = (er_i(v) \mathbin{\lfloor\!\lfloor} er_j(w) + er_j(w) \mathbin{\lfloor\!\lfloor} er_i(v) + er_i(v) \mid er_j(w)) ; s_i(F(v,w)) \]
\[ = (er_i(v) \cdot er_j(w) + er_j(w) \cdot er_i(v) + \delta) ; s_i(F(v,w)) \qquad \text{(CF2)} \]
\[ = (er_i(v) \cdot er_j(w)) ; s_i(F(v,w)) + (er_j(w) \cdot er_i(v)) ; s_i(F(v,w)) \qquad \text{(PP5)} \]
\[ = er_i(v) ; (er_j(w) ; s_i(F(v,w))) + er_j(w) ; (er_i(v) ; s_i(F(v,w))). \qquad \text{(PP6)} \]

Applying axiom PP4 to the last expression yields that

\[ (er_i(v) \parallel er_j(w)) ; s_i(F(v,w)) = \begin{cases} \sum_{v:D} (r_i(v) \cdot \sum_{w:D} (r_j(w) \cdot s_i(F(v,w)))) \\ + \\ \sum_{w:D} (r_j(w) \cdot \sum_{v:D} (r_i(v) \cdot s_i(F(v,w)))), \end{cases} \]

showing conciseness of notation with early read actions and process prefix.
Observe that in case we replace $w$ by $v$,

\[ (er_i(v) \parallel er_j(v)) ; s_i(F(v,v)) = \begin{cases} \sum_{v:D} (r_i(v) \cdot \sum_{v:D} (r_j(v) \cdot s_i(F(v,v)))) \\ + \\ \sum_{v:D} (r_j(v) \cdot \sum_{v:D} (r_i(v) \cdot s_i(F(v,v)))). \end{cases} \]

This example models non-deterministic input along one of the channels $i, j$,
yielding send-action $s_i(F(k,k))$ with $k$ being the value received. In the peculiar
case that the channel identifier $j$ is replaced by $i$, we find that

\[ (er_i(v) \parallel er_i(w)) ; s_i(F(v,w)) = \begin{cases} \sum_{v:D} (r_i(v) \cdot \sum_{w:D} (r_i(w) \cdot s_i(F(v,w)))) \\ + \\ \sum_{w:D} (r_i(w) \cdot \sum_{v:D} (r_i(v) \cdot s_i(F(v,w)))) \end{cases} \]

models the sending of either $s_i(F(j,k))$ or $s_i(F(k,j))$ if values $j$ and $k$ are sent
along $i$. Finally,

\[ (er_i(v) \parallel er_i(v)) ; s_i(F(v,v)) = (er_i(v) \cdot er_i(v)) ; s_i(F(v,v)) \]

describes the situation in which the first value input along channel $i$ is
neglected, and the second one instantiates $F(v,v)$.
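
The unravelling of parallel input above can be mimicked by enumerating traces. The following Python sketch is an illustration only and not part of the process-algebraic theory: it assumes a finite data type, the helper name `parallel_input_traces` and the tuple encoding of actions are hypothetical, and the output channel label `"out"` is a placeholder. Since early reads do not communicate with each other, the merge reduces to all interleavings of the reads, followed by the send action.

```python
from itertools import permutations, product


def parallel_input_traces(channels, domain, out_channel, f):
    """Enumerate the traces of (er_c1(v1) || ... || er_ck(vk)); s_out(f(v1, ..., vk)).

    Each read is encoded as ("r", channel, value) and the final send as
    ("s", out_channel, f(values)). All orders of the reads are generated.
    """
    traces = set()
    for values in product(domain, repeat=len(channels)):
        reads = [("r", c, v) for c, v in zip(channels, values)]
        send = ("s", out_channel, f(*values))
        for order in permutations(reads):
            traces.add(tuple(order) + (send,))
    return traces


# Two channels i and j over D = {0, 1}; F simply pairs its arguments.
traces = parallel_input_traces(["i", "j"], [0, 1], "out", lambda v, w: (v, w))
```

Trace sets do not record branching structure, so this sketch only shows which reads and sends can occur in which order; it does not capture the moment at which a choice between summands is made.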

Remark 2.2.2. We stress that the process

\[ \sum_{v:D} \Big(\sum_{w:D} ((r_i(v) \parallel r_j(w)) \cdot s_i(F(v,w)))\Big) \]

does not model parallel input: if for example $D = \{0, 1\}$,

\[ \sum_{v:D} \Big(\sum_{w:D} ((r_i(v) \parallel r_j(w)) \cdot s_i(F(v,w)))\Big) = (r_i(0) \parallel r_j(0)) \cdot s_i(F(0,0)) + \]
\[ (r_i(0) \parallel r_j(1)) \cdot s_i(F(0,1)) + \]
\[ (r_i(1) \parallel r_j(0)) \cdot s_i(F(1,0)) + \]
\[ (r_i(1) \parallel r_j(1)) \cdot s_i(F(1,1)). \]

So, upon reading the first value along channel $i$ or $j$, the choice for the second read
action is already fixed.

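Furthermore, $\tau_I$ and $\partial_H$-applications also apply to $er$-actions via axiom PP4,
using the $\mu$CRL axioms that state that these applications commute with the
$\sum$-operation. For instance,

\[ \partial_{\{r_i\}}(er_i(v) ; P_1 + er_j(w) ; P_2) \]
\[ = \sum_{v:D} (\partial_{\{r_i\}}(r_i(v) P_1)) + \sum_{w:D} (\partial_{\{r_i\}}(r_j(w) P_2)) \]
\[ = \sum_{v:D} (\delta) + \sum_{w:D} (r_j(w) \, \partial_{\{r_i\}}(P_2)) \]
\[ = \delta + er_j(w) ; \partial_{\{r_i\}}(P_2) \]
\[ = er_j(w) ; \partial_{\{r_i\}}(P_2). \]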

3 Modules and Connected, single-I/O Networks 

In this section we propose a specification format for connected networks. Such a 
network consists of modules, elementary data processing units which may have 
feedback. Next we introduce connected networks as a format for the parallel 
execution of such modules. Our modeling is based on [TT94], in which SCAs
(Synchronous Concurrent Algorithms) are analyzed.

3.1 Modules 

A module $M_i$ has a (current) value $d$, a fixed (positive) number $n$ of input
channels $i_1, \ldots, i_n$, and a fixed (positive) number $m$ of output channels $o_1, \ldots, o_m$.
Channels have unit bandwidth and are unidirectional; this corresponds with our
format for value-passing as discussed in the previous section. We first consider a
setting with only one data type $D$. Computation in $M_i$ is modeled by a (total)
value function $F_i : D^n \to D$. The complete operation of module $M_i(d)$ can be
described as follows:

\[ M_i(d) = \Big( \Big[\big\Vert_{j=1}^{n} er_{i_j}(v_j)\Big] \parallel \Big[\big\Vert_{j=1}^{m} s_{o_j}(d)\Big] \Big) ; M_i(F_i(v_1, \ldots, v_n)) \qquad (1) \]

Unfortunately, this definition presupposes that $M_i$ has no feedback, i.e. that
$\{i_1, \ldots, i_n\} \cap \{o_1, \ldots, o_m\} = \emptyset$, because early read actions do not communicate
(otherwise, the value in question would get lost). So for the particular case that
$M_i$ also has a feedback channel $f$, its definition should be something like

\[ M_i(d) = \Big( \Big[\big\Vert_{j=1}^{n} er_{i_j}(v_j)\Big] \parallel \Big[\big\Vert_{j=1}^{m} s_{o_j}(d)\Big] \Big) ; M_i(F_i(d, v_1, \ldots, v_n)), \qquad (2) \]

where the first argument of $F_i$ models the feedback. This has two disadvantages:
we lose uniformity of specification, and the feedback action is not explicit, only
its effect is. As can be expected, we allow at most one feedback channel per
module.

Because uniformity of specification is particularly useful in proving
correctness, we adopt a "low level specification format" of modules, and obtain (1) as
a derivable identity. Because feedback is an explicit, but hidden action in our
specification format, we postpone the derivable variant of (2) till after the
introduction of the operation of networks. Module $M_i(d)$ as described above is defined
by means of two iterative processes. The first one of these defines the receive-part
$Rec_i$ of the module (modeling its read actions), the second its send-part $Send_i(d)$
(ready to send the value $d$ along the ports $o_1, \ldots, o_m$). These two parts
communicate along some channel $i$, internal to module $M_i$, and are also used to model
computation of $F_i$. More precisely, communication of a value $F_i(d_1, \ldots, d_n)$ by
(internal) action $c_i(F_i(d_1, \ldots, d_n))$ can only take place if all parallel read actions
of $Rec_i$ have been executed, and if also $Send_i(d)$ has performed all its (parallel)
send actions $s_{o_j}(d)$. This yields the following specification and picture of $M_i(d)$:



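\[ M_i(d) = \tau_{\{c_i\}} \circ \partial_{\{r_i,s_i\}}(Rec_i \parallel Send_i(d)), \]
\[ Rec_i = \Big( \Big[\big\Vert_{j=1}^{n} er_{i_j}(v_j)\Big] ; s_i(F_i(v_1, \ldots, v_n)) \Big)^{\omega}, \]
\[ Send_i(d) = \Big[\big\Vert_{j=1}^{m} s_{o_j}(d)\Big] \cdot Send_i, \]
\[ Send_i = \Big( er_i(v) ; \Big[\big\Vert_{j=1}^{m} s_{o_j}(v)\Big] \Big)^{\omega}. \]

[Figure: module $M_i(d)$ with its input and output channels.]
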
In case $M_i$ has no feedback, i.e., $\{i_1, \ldots, i_n\} \cap \{o_1, \ldots, o_m\} = \emptyset$, it follows that
$M_i(d)$ has a process prefix

\[ (er_{i_1}(v_1) \parallel \ldots \parallel er_{i_n}(v_n) \parallel s_{o_1}(d) \parallel \ldots \parallel s_{o_m}(d)) \]

(this is a consequence of Theorem 3.3.3). After having read certain values
$d_1, \ldots, d_n$ along channels $i_1, \ldots, i_n$, and having sent $d$ along ports $o_1, \ldots, o_m$, the
module's current value is updated to $F_i(d_1, \ldots, d_n)$ by a value-passing communication
along channel $i$, renamed into the silent action $\tau$. After this, the next process
prefix is ready to be performed:

\[ (er_{i_1}(v_1) \parallel \ldots \parallel er_{i_n}(v_n) \parallel s_{o_1}(F_i(d_1, \ldots, d_n)) \parallel \ldots \parallel s_{o_m}(F_i(d_1, \ldots, d_n))). \]

For readability, we introduce the following abbreviation for synchronization
and abstraction over some port $i$: we further write

\[ P \parallel_i Q \quad \text{instead of} \quad \tau_{\{c_i\}} \circ \partial_{\{r_i,s_i\}}(P \parallel Q). \]

Henceforth, $M_i(d) = Rec_i \parallel_i Send_i(d)$.

Because the specific typing of the read and send actions is not relevant,
except that the function $F_i$ must be compatible with it, we further consider a
many-sorted setting. We assume that each variable is uniquely typed.



3.2 Connected Networks 

A network is a finite collection of modules, in which the read/send connections
respect the typing of the corresponding read and send actions. A general
restriction is that there is at most one channel for transmission between any two
modules $M_i$ and $M_j$. In particular, we do not allow merging of channels, or
more than one feedback channel per module (case $i = j$). Note that branching of
channels is modeled by taking different send actions. In this section we consider
connected network specifications of the form

\[ \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(d_i) \Big). \]

Here, connectedness refers to the graph which has the modules as nodes, and
the (undirected) channels as arcs. In the specification above, the $\partial_H$ application
models value-passing synchronizations between modules $M_1, \ldots, M_n$, and the $\tau_I$
application models hiding of the resulting communication actions. We further
say that the I/O of a network denotes its external actions, i.e., read or send
actions that have no communication partner within the network. A network
is single-output if its I/O consists of exactly one output action, which will be
referred to as

\[ s_{out}(\ldots). \]

Below we recall an example taken from [BHP97] for computing a Fibonacci
sequence using a connected single-output network consisting of modules $M_1$ and
$M_2$. This example also illustrates the particular way we deal with feedback.

Example 3.2.1. Consider the following network in which all values to be
passed are of type $\mathbb{N}$, and in which a channel name $ij$ indicates that values are
transmitted from module $M_i$ to module $M_j$:

We can specify these modules by the following two iterative processes $M_1(n)$
and $M_2(m)$:

\[ M_1(n) = Rec_1 \parallel_1 Send_1(n), \]
\[ Rec_1 = (er_{21}(v) ; s_1(v))^{\omega}, \]
\[ Send_1(n) = (s_{out}(n) \parallel s_{12}(n)) \cdot Send_1, \]
\[ Send_1 = (er_1(v) ; (s_{out}(v) \parallel s_{12}(v)))^{\omega}, \]

and

\[ M_2(m) = Rec_2 \parallel_2 Send_2(m), \]
\[ Rec_2 = ((er_{12}(v_1) \parallel er_{22}(v_2)) ; s_2(v_1 + v_2))^{\omega}, \]
\[ Send_2(m) = (s_{21}(m) \parallel s_{22}(m)) \cdot Send_2, \]
\[ Send_2 = (er_2(v) ; (s_{21}(v) \parallel s_{22}(v)))^{\omega}. \]

Let $I = \{c_{21}, c_{12}, c_{22}\}$ and $H = \{r_{21}, s_{21}, r_{12}, s_{12}, r_{22}, s_{22}\}$. The Fibonacci
Network

\[ \tau_I \circ \partial_H(M_1(1) \parallel M_2(1)) \]

computes the ordinary Fibonacci sequence 1, 1, 2, 3, 5, 8, ... as the values of its
consecutive $s_{out}$-actions:

\[ \tau \cdot \tau_I \circ \partial_H(M_1(1) \parallel M_2(1)) = \tau \cdot s_{out}(1) \cdot s_{out}(1) \cdot s_{out}(2) \cdot s_{out}(3) \cdots \]

where the leftmost $\tau$'s smooth the difference between the network's first possible
actions: either $s_{out}(1)$ or $\tau$ resulting from some internal value-passing. A different
characterization is given by the equation

\[ \tau \cdot \tau_I \circ \partial_H(M_1(n) \parallel M_2(m)) = \tau \cdot s_{out}(n) \cdot \tau_I \circ \partial_H(M_1(m) \parallel M_2(n + m)), \]

from which it is immediately clear that $\tau \cdot \tau_I \circ \partial_H(M_1(1) \parallel M_2(1))$ computes the
Fibonacci sequence. This equation can easily be grasped from the picture above;
its correctness follows from Theorem 3.3.1 presented in the following section.
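
The second characterization of the Fibonacci Network can be read as a recurrence on the pair of module values. The Python sketch below is an illustration under that reading only: the generator name `fibonacci_network` is hypothetical, and silent steps are not modelled, so only the values carried by the consecutive output actions are produced.

```python
from itertools import islice


def fibonacci_network(n, m):
    """Yield the values of the consecutive output actions of the network.

    One iteration mirrors one use of the characterizing equation: the
    network with values (n, m) outputs n and continues with (m, n + m).
    """
    while True:
        yield n
        n, m = m, n + m


first_outputs = list(islice(fibonacci_network(1, 1), 6))
print(first_outputs)
```

Starting from other initial values gives the corresponding generalized sequence, for example `fibonacci_network(2, 1)`.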

3.3 Characterization of Connected, Single-I/O Networks 

In this section two correctness results on connected single-I/O networks are
recalled ([BHP97]), the second of which is a generalization of the first. We show
by an example how a simple case of the first result can be proved.

Theorem 3.3.1. Let $n \geq 1$, $d = d_1, \ldots, d_n$ be a sequence of typed values, and
let

\[ N(d) = \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(d_i) \Big) \]

be a network that is connected and single-output, where $M_1$ is the
output-module. Then

\[ \tau \cdot N(d) = \tau \cdot s_{out}(d_1) \cdot N(F_1(d_1'), \ldots, F_n(d_n')) \]

where $F_i$ is the value function of module $M_i$, and $d_i'$ abbreviates $d_{i_1}, \ldots, d_{i_k}$
whenever $F_i$ computes on the values of modules $M_{i_1}, \ldots, M_{i_k}$, respectively.

For the case $n = 1$, the proof of the theorem is trivial. We spell out a most simple
instance. This also shows what a derivable variant of (2), i.e., a data-parametric
definition of a module with feedback, would look like.
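
Read operationally, Theorem 3.3.1 says that a connected single-output network repeatedly emits the value of its output module and then updates all module values simultaneously. The Python sketch below illustrates this reading only; the function `network_step`, the 0-based module indices and the list encoding of the dependencies are assumptions of the sketch and do not occur in the source. It is instantiated with the Fibonacci network of Example 3.2.1.

```python
def network_step(values, functions, inputs):
    """Perform one phase of a connected, single-output network.

    values[i]    current value of module i (module 0 is the output module)
    functions[i] value function of module i
    inputs[i]    indices of the modules whose values functions[i] reads
    Returns the emitted output and the simultaneously updated values.
    """
    output = values[0]
    new_values = [
        f(*(values[k] for k in deps))
        for f, deps in zip(functions, inputs)
    ]
    return output, new_values


# Fibonacci network: module 0 copies the value of module 1,
# module 1 adds the values of module 0 and of itself (feedback).
functions = [lambda v: v, lambda v1, v2: v1 + v2]
inputs = [[1], [0, 1]]

values, outputs = [1, 1], []
for _ in range(6):
    out, values = network_step(values, functions, inputs)
    outputs.append(out)
```

All new values are computed from the old value vector before any of them is replaced, which corresponds to the simultaneous update on the right-hand side of the theorem.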

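Example 3.3.2. Consider the following network $N(d)$ where $d \in \mathbb{N}$, containing
one module

\[ M_1(d) = Rec_1 \parallel_1 Send_1(d) \]

that generates a stream:

Here

\[ Rec_1 = \Big(\sum_{v:\mathbb{N}} (r_{11}(v) \cdot s_1(v + 1))\Big)^{\omega}, \]
\[ Send_1 = \Big(\sum_{v:\mathbb{N}} (r_1(v)(s_{11}(v) \parallel s_{out}(v)))\Big)^{\omega}, \]
\[ Send_1(d) = (s_{11}(d) \parallel s_{out}(d)) \cdot Send_1. \]

Let $H = \{r_{11}, s_{11}\}$. The behaviour of $\partial_H(Rec_1 \parallel_1 Send_1(d))$ can be analyzed as
follows:

\[ \partial_H(M_1(d)) = \partial_H(Rec_1 \parallel_1 Send_1(d)) \]
\[ = c_{11}(d) \cdot \partial_H(s_1(d + 1) \cdot Rec_1 \parallel_1 s_{out}(d) \cdot Send_1) \]
\[ \quad + s_{out}(d) \cdot \partial_H(Rec_1 \parallel_1 s_{11}(d) \cdot Send_1) \]
\[ = c_{11}(d) \cdot s_{out}(d) \cdot \partial_H(s_1(d + 1) \cdot Rec_1 \parallel_1 Send_1) \]
\[ \quad + s_{out}(d) \cdot c_{11}(d) \cdot \partial_H(s_1(d + 1) \cdot Rec_1 \parallel_1 Send_1) \]
\[ = c_{11}(d) \cdot s_{out}(d) \cdot c_1(d + 1) \cdot \partial_H(Rec_1 \parallel_1 Send_1(d + 1)) \]
\[ \quad + s_{out}(d) \cdot c_{11}(d) \cdot c_1(d + 1) \cdot \partial_H(Rec_1 \parallel_1 Send_1(d + 1)). \]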






Let $I = \{c_{11}\}$, then it follows from the derivation above that the one-module
network

\[ N(d) = \tau_I \circ \partial_H(M_1(d)) \]

for some $d \in \mathbb{N}$ satisfies

\[ N(d) = \tau \cdot s_{out}(d) \cdot N(d + 1) + s_{out}(d) \cdot N(d + 1). \]

Hence

\[ \tau \cdot N(d) = \tau \cdot s_{out}(d) \cdot N(d + 1), \]

expressing that $\tau \cdot N(d)$ outputs the infinite stream

\[ \tau \cdot s_{out}(d) \cdot s_{out}(d + 1) \cdot s_{out}(d + 2) \cdots \]



In general, in a connected, single-output network with more than one module, all
modules but the output module can be partitioned in a number of connected
subnetworks that perform I/O with the output module only. From this perspective,
the correctness Theorem 3.3.1 can be easily proved. The reader interested in a
proof is referred to [BHP97].^

We can relax the conditions of Theorem 3.3.1 under which the execution of
a network satisfies a single process prefix, followed by a recursive update of its
data state. A first generalization concerns the output actions of a connected,
single-output network. It is not hard to see that the previous correctness result
is preserved if such a network outputs actions of the form

\[ s_{out}(F(d)) \]

for some function $F$ rather than $s_{out}(d)$. We call this output modification of the
out-channel.

A second generalization concerns additional external output of the network.
Assume a network

\[ N(d) = \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(d_i) \Big) \]

has more than one output channel, and that $I$ is such that all extra output
channels not located at the I/O module are hidden. Then $N(d)$ is single-I/O if
all its I/O activity (its collection of external read and send actions) stems from a
single module, the I/O-module. Our most general result on the class of connected
networks is the following: ^

^ We remark that the last condition in the definition of Reci, i.e., m €. Ri\R' Xm € 
, should be skipped. 

^ Cf. [BHP97], though we exploit the Expansion Theorem and alphabet axioms (both
recalled in the Appendix) to obtain a nicer formulation.

Theorem 3.3.3. Let $n \geq 1$, $d = d_1, \ldots, d_n$ be a sequence of typed values, and
let

\[ N(d) = \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(d_i) \Big) \]

be a network that is connected and single-I/O, where $M_1$ is the I/O-module and
Extern is the set of indices of the I/O channels. (Notice that Extern $\neq \emptyset$ and
that $\tau_I$ may hide output from non-I/O-modules.)



Then

\[ \tau \cdot N(d) = \tau \cdot \Big( \big\Vert_{i \in Extern} a_i(x_i) \Big) ; \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(F_i(d_i')) \Big) \]

where

1. Function $F_i$ is the value function of module $M_i$,

2. For $i > 1$, $d_i'$ abbreviates $d_{i_1}, \ldots, d_{i_k}$ whenever $F_i$ computes on the values of
modules $M_{i_1}, \ldots, M_{i_k}$, respectively,

3. For $i \in$ Extern, either $a_i(x_i) = s_i(G_i(d_i))$ where $G_i$ is the output
modification of channel $i$, or $a_i(x_i) = er_i(v_i)$,

4. Sequence $d_1'$ is defined similarly, except for its Extern-coordinates (see
clause 3).

This result gives way to regarding networks as stream transformers, be it that
the I/O connection is located at a single module. In particular, this allows one
to connect single-I/O networks with each other while preserving a simple
correctness characterization. We apply this theorem in Section 5.



4 Beating Grid Protocols 

In some cases the restriction to single-I/O networks is too strong. If for example
one wants to model the operation of a simple SR-latch in process algebra (or
RS Flip-Flop, cf. Section 5.1.2 in [TT94]), we obtain a network with multi-I/O.
Below we depict an SR-latch in two typical states: irrespective of the value $b$,
output at $Q$ is 0, respectively 1. The components below symbolize nor-ports
(where $nor(x, y) = 1 - \max(x, y)$).

The format for connected networks discussed in the previous section does not
comply with the intended operation of this network: it does not imply that $S$ is
evaluated before $R$ starts its second evaluation. In this section we obtain the
following type of characterization for multi-I/O networks, expressing that the
I/O is performed per cycle:

\[ \tau \cdot N(d) = \tau \cdot ((\parallel \text{all I/O value-passing actions}) ; N(F_i(d_i'))) \]

i.e., we want to view network operation in a similar way as module operation,
performing I/O in consecutive phases. A way to achieve this is to assume a global
synchronization device, the Beat process. We consider two different options for
the definition of such a device.



4.1 Synchronized Modules

We follow the approach described in Section 3.1, but extend module specification
with two external synchronization points with a Beat process, and one extra
internal synchronization point. Another difference is that we associate a unique
variable with each input channel:



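\[ M_i(d) = Rec_i \parallel_i Send_i(d), \]
\[ Rec_i = \Big( r_i \cdot \Big[ \Big(\big\Vert_{j=1}^{n} er_{i_j}(v_{i_j})\Big) ; s_i(F_i(v_{i_1}, \ldots, v_{i_n})) \Big] \Big)^{\omega}, \]
\[ Send_i(d) = r_{b_i}(s) \cdot s_i \cdot \Big(\big\Vert_{j=1}^{m} s_{o_j}(d)\Big) \cdot Send_i, \]
\[ Send_i = \Big( er_i(v_i) ; \Big[ r_{b_i}(e) \cdot r_{b_i}(s) \cdot s_i \cdot \Big(\big\Vert_{j=1}^{m} s_{o_j}(v_i)\Big) \Big] \Big)^{\omega}, \]

and

\[ Beat = \Big( \Big(\big\Vert_{i=1}^{n} s_{b_i}(s)\Big) \cdot \Big(\big\Vert_{i=1}^{n} s_{b_i}(e)\Big) \Big)^{\omega} \quad \text{or} \quad Beat = \Big( \big\Vert_{i=1}^{n} (s_{b_i}(s) \cdot s_{b_i}(e)) \Big)^{\omega}, \]

provided we consider a network with $n$ modules.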




The idea is that $M_i(d)$ starts with a Beat-communication $c_{b_i}(s)$, whereupon
$Rec_i$ and $Send_i(d)$ synchronize with a $c_i$-communication and can start their
parallel input and output actions. After this, $Rec_i$ and $Send_i$ synchronize by
value-passing action $c_i(F_i(d_i'))$, and "end-synchronization" with Beat can take
place by a $c_{b_i}(e)$ communication, by which the module evolves into $M_i(F_i(d_i'))$.
In the following section we further explain the synchronization actions between
$M_i$ and Beat.



4.2 Networks with a Beat 



Again a network is a collection of modules as defined above, in which the
read/send connections respect the typing of the read and send actions. As in
the previous section we adopt the restriction that there is at most one channel
for transmission of data from a module to a module (but feedback is allowed).
For an $n$-module network, we consider two different definitions for the Beat
process as given above. So, the action $s_{b_i}(s)$ gives module $M_i$ permission to start
its parallel I/O, and $s_{b_i}(e)$ signals end of this activity. By definition of send-read
communication, these actions yield communications $c_{b_i}(s)$ and $c_{b_i}(e)$. Note that
the second definition of Beat is most liberal, as it covers each execution of the
first one. (The converse is not true: in the first definition all $s_{b_i}(s)$ actions must
take place before an $s_{b_i}(e)$ action can occur.)

In this section we consider network specifications of the form

\[ \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(d_i) \parallel Beat \Big) \]

in which the $\partial_H$ application models value-passing synchronizations between
modules $M_1, \ldots, M_n$ and Beat, and the $\tau_I$ application models hiding of the resulting
communication actions. In this case, the rhythm of the Beat guides the operation
of the network.

Below we give an example for computing the operation of an SR-latch using
a beating grid protocol.



Example 4.2.1. SR-Latch. Consider the following network in which all data
to be transmitted are in $\{0, 1\}$. A channel name $ij$ indicates that values are
transmitted from module $M_i$ to module $M_j$. We make the branching of output of
the $R, Q$-module explicit in the module (corresponding with the restriction that
channels have bandwidth 1):

The precise specification of $M_1(b_1)$ and $M_2(b_2)$ is as follows:

\[ M_1(b_1) = Rec_1 \parallel_1 Send_1(b_1), \]
\[ Rec_1 = (r_1 \cdot [(er_R(v_R) \parallel er_{21}(v_{21})) ; s_1(nor(v_R, v_{21}))])^{\omega}, \]
\[ Send_1(b_1) = r_{b_1}(s) \cdot s_1 \cdot (s_Q(b_1) \parallel s_{12}(b_1)) \cdot Send_1, \]
\[ Send_1 = (er_1(v) ; [r_{b_1}(e) \cdot r_{b_1}(s) \cdot s_1 \cdot (s_Q(v) \parallel s_{12}(v))])^{\omega}, \]

\[ M_2(b_2) = Rec_2 \parallel_2 Send_2(b_2), \]
\[ Rec_2 = (r_2 \cdot [(er_S(v_S) \parallel er_{12}(v_{12})) ; s_2(nor(v_S, v_{12}))])^{\omega}, \]
\[ Send_2(b_2) = r_{b_2}(s) \cdot s_2 \cdot s_{21}(b_2) \cdot Send_2, \]
\[ Send_2 = (er_2(v) ; [r_{b_2}(e) \cdot r_{b_2}(s) \cdot s_2 \cdot s_{21}(v)])^{\omega}. \]

Let $I = \{c_{21}, c_{12}\}$ and $H = \{r_{21}, s_{21}, r_{12}, s_{12}\}$. We argue that

\[ \tau_I \circ \partial_H(M_1(0) \parallel M_2(1) \parallel Beat) \]

computes the operation of an SR-latch per two cycles. First we analyze the
behaviour of

\[ G(b_1, b_2) = \tau_I \circ \partial_H(M_1(b_1) \parallel M_2(b_2)) \]

in a graphical style. The characterization theorem for beating grid protocols,
which we present below, yields that

\[ \tau \cdot G(b_1, b_2) = \tau \cdot ((er_R(c_1) \parallel er_S(c_2) \parallel s_Q(b_1)) ; G(nor(b_2, c_1), nor(b_1, c_2))). \]

In order to further analyze this behaviour we use a graphical style, deleting
$\tau$-steps and only showing the input-value pairs $(c_R, c_S)$. The output value $b_1$ of
$s_Q(b_1)$ is characterized as the first value of $G(\_, \_)$:




[Figure: transition diagram of the states $G(b_1, b_2)$, among them $G(1,1)$.]

As argued in [TT94], the SR-latch behaviour can be traced back if one
assumes that input is offered twice per cycle to the network. Let $L(b_1, b_2)$ represent
the appropriate $G$-states, and use the following coding of input pairs:

set for $(0,1) \cdot (0,1)$ (output 1 at $Q$)

reset for $(1,0) \cdot (1,0)$ (output 0 at $Q$)

hold for $(0,0) \cdot (0,0)$ (output at $Q$ the same as in the previous cycle)

Then $L(0,0)$ has the intended behaviour, as follows from the behaviour of $G(0,0)$
as analyzed above:

[Figure: transition diagram of the $L$-states under set, reset and hold.]

So, $L(1,0)$ is the set-state, $L(0,1)$ is the reset-state, and also $L(0,0)$ is considered
as possible initial state.
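
The characterization of $G(b_1, b_2)$ can be explored with a small state-update function. The Python sketch below is only an illustration: it assumes that input pairs are ordered as (value on R, value on S), the function names `latch_step` and `latch_cycle` are hypothetical, and silent steps and the interleaving of the parallel I/O actions are ignored.

```python
def nor(x, y):
    """nor(x, y) = 1 - max(x, y) on values in {0, 1}."""
    return 1 - max(x, y)


def latch_step(state, pair):
    """One phase of G(b1, b2): return the value sent at Q and the next state."""
    b1, b2 = state
    c_r, c_s = pair
    return b1, (nor(b2, c_r), nor(b1, c_s))


def latch_cycle(state, pair):
    """Offer the same input pair twice, as in the coding set/reset/hold."""
    for _ in range(2):
        _, state = latch_step(state, pair)
    return state


SET, RESET, HOLD = (0, 1), (1, 0), (0, 0)

after_set = latch_cycle((0, 0), SET)
after_reset = latch_cycle(after_set, RESET)
```

Under these assumptions a set cycle leads to the state (1, 0), a reset cycle to (0, 1), and a hold cycle leaves each of the states (1, 0), (0, 1) and (0, 0) unchanged after its two phases.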



4.3 Characterization of Beating Grid Protocols 

In this section we state our final correctness results on networks. As before, we
allow output modification of the external output-actions.

Theorem 4.3.1. Let $n \geq 1$, $d = d_1, \ldots, d_n$ be a sequence of typed values, and
let

\[ N(d) = \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(d_i) \parallel Beat \Big) \]

be a beating grid protocol, with synchronized modules. Furthermore, let Beat
be defined in one of the following ways:

1. $Beat = \Big( \Big(\big\Vert_{i=1}^{n} s_{b_i}(s)\Big) \cdot \Big(\big\Vert_{i=1}^{n} s_{b_i}(e)\Big) \Big)^{\omega}$,

2. $Beat = \Big( \big\Vert_{i=1}^{n} (s_{b_i}(s) \cdot s_{b_i}(e)) \Big)^{\omega}$.

Then

\[ \tau \cdot N(d) = \tau \cdot \Big( \big\Vert_{i \in Extern} a_i(x_i) \Big) ; \tau_I \circ \partial_H \Big( \big\Vert_{i=1}^{n} M_i(F_i(d_i')) \parallel Beat \Big), \]

where

1. Extern $\neq \emptyset$ is the set of indices of the I/O channels,

2. Function $F_i$ is the value function of module $M_i$,

3. For $i \in$ Extern, either $a_i(x_i) = s_i(G_i(d_i))$ where $G_i$ is the output
modification of channel $i$, or $a_i(x_i) = er_i(v_i)$,

4. The sequence $d_i'$ abbreviates $v_{i_1}, \ldots, v_{i_k}$ whenever $F_i$ computes on input
channels $i_1, \ldots, i_k$.

Various inductive proofs of this result [BJM97,Pou97] use a second
characterization that covers the case that a network has no external connections. We
apply this characterization result in the next section.

5 An Example: the Wave Equation 

In this section we specify a given algorithm for approximation of a wave equation 
in a single-output and connected network. The description of this example is 
taken from [BHP97]. 

5.1 The wave equation 

The linear homogeneous partial differential equation

\[ \frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2} \]

is known in wave mechanics as the one-dimensional wave equation; it describes
transversal propagation along the $x$-coordinate of amplitude $y(x,t)$ of a wave.
This equation models for instance vibrations in a string, where it is required that
the tension in the string is approximately constant. The constant $c$ is defined
by $c = \sqrt{T/\rho}$, where $T$ is the tension in the string and $\rho$ the string mass per unit
of length. In solutions $y(x,t)$ the constant $c$ is interpreted as the propagation
velocity of the wave in transversal direction.

In order to solve the wave equation, boundary and initial conditions are
needed. As boundary conditions we assume that $y(0,t) = y(l,t) = 0$ for $t > 0$,
i.e. that the string is fixed in $x = 0$ and $x = l$. With these boundary
conditions a string amplitude at some time $t$, as a function of $x$, may be graphically
represented as in Fig. 2.

In case we have $y(x,0)$ and $\partial y/\partial t|_{t=0}$ given as initial conditions for $0 < x < l$,
it is possible to derive an approximation of $y(x, \Delta t)$, where $\Delta t$ is a very small
time interval. The values $y(x,0)$ and $y(x, \Delta t)$ are used for the initialization of
an algorithm that numerically solves the wave equation.

Let $N$ be a natural number, and $\Delta x = l/N$ a very small length interval. We
define

\[ F(z_1, z_2, z_3, z_4) = 2z_1 - z_2 + \Big(c \frac{\Delta t}{\Delta x}\Big)^2 (z_3 - 2z_1 + z_4), \]

and

\[ y_i(t + \Delta t) = F(y_i(t), y_i(t - \Delta t), y_{i-1}(t), y_{i+1}(t)). \]

From numerical analysis it is known that $y_i(t)$ approximates $y(i\Delta x, t)$ for
$1 \leq i \leq N - 1$, and $t \geq 2\Delta t$ (see e.g. [Smi65,FJL+88]). Therefore, the above
equation for $y_i(t + \Delta t)$ may serve as a basis for numerical approximation of
solutions of the wave equation.
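
The update rule can be written down directly. The Python sketch below is a sequential illustration of one time step, not the grid protocol itself: the helper names `make_F` and `wave_step` are hypothetical, amplitudes are stored in plain lists indexed by the sample point, and the two boundary values are kept at 0 as required by the boundary conditions.

```python
def make_F(c, dt, dx):
    """Build F(z1, z2, z3, z4) = 2*z1 - z2 + (c*dt/dx)**2 * (z3 - 2*z1 + z4)."""
    r2 = (c * dt / dx) ** 2

    def F(z1, z2, z3, z4):
        return 2 * z1 - z2 + r2 * (z3 - 2 * z1 + z4)

    return F


def wave_step(curr, prev, F):
    """Compute the amplitudes at t + dt from those at t (curr) and t - dt (prev).

    Index 0 and index N are the fixed end points of the string.
    """
    n = len(curr) - 1
    nxt = [0.0] * (n + 1)
    for i in range(1, n):
        nxt[i] = F(curr[i], prev[i], curr[i - 1], curr[i + 1])
    return nxt


F = make_F(c=1.0, dt=0.5, dx=0.5)
next_amplitudes = wave_step([0.0, 1.0, 0.0], [0.0, 1.0, 0.0], F)
```

Each interior value depends only on values of the current and the previous step, so all interior points can be updated independently of each other.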

Now an algorithm for calculating wave amplitudes $y_i(t)$ may be designed
which uses one processor per sample point on the $x$-axis, i.e., one for every $i$ and
one for each boundary. As a result the calculations for the string amplitude at
some sample moment $t$ will be carried out by $N + 1$ processors in parallel. In
fact, $N - 1$ processors will suffice, since the values at the sample points $i = 0$
and $i = N$ are already known from the boundary conditions.

In the next section we specify a connected, single-output grid protocol that
models this approximation. For simplicity we assume that $\Delta x$ and $\Delta t$ are given,
and that there is no interaction between a user of the algorithm and the
algorithm itself; the algorithm just produces an infinite stream of outputs. Of course
we need a criterion for correctness; we require that the algorithm outputs
approximations of the total string amplitudes on the successive sample moments:

\[ y(0), y(\Delta t), y(2\Delta t), \ldots \]

where $y(t)$ abbreviates $y_0(t), \ldots, y_N(t)$. Other requirements are that the
algorithm contains no deadlocks or livelocks, so that it is always able to proceed.
We will see from one simple equation on the external behaviour of the algorithm
that these three requirements are satisfied. This equation immediately follows
from the correctness theorems presented earlier.

5.2 Grid Protocols modeling the Wave 

In the previous section we established the following equation for the calculation
of the value of coordinate $y_i$ at time $t + \Delta t$:

\[ y_i(t + \Delta t) = F(y_i(t), y_i(t - \Delta t), y_{i-1}(t), y_{i+1}(t)). \]

Modeling this approximation as a grid protocol obscures explicit reference to
time. All that is left is that computation of the amplitudes at time $t + \Delta t$ depends
on the amplitudes at time $t$ and time $t - \Delta t$, and that our protocol is modeled in
such a way that all outputted amplitudes are computed in consecutive phases.
So any reference to time $t$ must be interpreted as to a certain computation phase.

The equation above shows that the current values (at time $t$) of coordinates
$y_i$, $y_{i-1}$, and $y_{i+1}$ are needed, as well as the previous value (at time $t - \Delta t$) of
coordinate $y_i$. Given these values, the function $F$ calculates the new value (at
time $t + \Delta t$) of $y_i$. When we model the approximation of the wave equation as a
grid protocol, we need a number of processors, each calculating the consecutive
values of one or more coordinates as floating reals. We choose to let one processor
calculate the values of one coordinate. For $N + 1$ coordinates, we thus define $N + 1$
processors $P_0, \ldots, P_N$. Each processor $P_i$ ($0 < i < N$) needs the following input:

— the output of processor $P_{i-1}$,

— the output of processor $P_i$ (itself),

— the output of processor $P_{i+1}$, and

— the previous output of processor $P_i$ (itself).

Naturally, processors $P_0$ and $P_N$ do not need input at all. However, for reasons
of uniformity we also use channels from $P_1$ to $P_0$ and from $P_{N-1}$ to $P_N$. The
last item above requires that we store the output of each processor for one time
slot. This is, however, not possible in a single module. We solve this problem
by splitting each processor $P_i$ into a calculating module $M_i$ and a delay module
$D_i$. The delay module does nothing more than store the output value of the
calculating module for one time slot. After that, this value is sent back to the
calculating module. We can now state that the input of each module $M_i$ ($0 <
i < N$) should be:

— the output of module $M_{i-1}$,

— the output of module $M_i$ (itself),

— the output of module $M_{i+1}$, and

— the output of module $D_i$.

This can be visualized as follows:

[Figure: calculating module $M_i$ with its delay module $D_i$ and the channels to its neighbours.]




We can proceed in two ways:

1. We can define a connected, single-output grid protocol. Here an additional
output module transmits the values of all 'processors', and hence acts as a
synchronizing device.

2. We can define a single-output beating grid protocol, collecting output directly
from the 'processors' described above.

We describe the first alternative in detail. It should be immediately clear how
to model the second approach. Now that we have composed a grid protocol
modeling the wave equation, we can start writing a specification in the early-read
format. This is not difficult: just read what happens from the picture. To
start with, we specify the $D_i$ ($0 < i < N$) modules:

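$$D_i(e) = \tau_{\{c_{(D_i)}\}} \circ \partial_{\{r_{(D_i)},\, s_{(D_i)}\}}(R_{D_i} \parallel S_{D_i}(e))$$

$$R_{D_i} = (er_{(M_i,D_i)}(v)\, ;\, s_{(D_i)}(v))^\omega$$

$$S_{D_i} = (er_{(D_i)}(v)\, ;\, s_{(D_i,M_i)}(v))^\omega$$

$$S_{D_i}(e) = s_{(D_i,M_i)}(e) \cdot S_{D_i}.$$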

Here, $er_{(M_i,D_i)}(v)$ and $s_{(D_i,M_i)}(v)$ stand for an early read or a send action on
the ports connecting $M_i$ and $D_i$. Note that $(M_i,D_i)$ is the port from $M_i$ to $D_i$
and $(D_i,M_i)$ the port from $D_i$ to $M_i$. The actions $er_{(D_i)}(v)$ and $s_{(D_i)}(v)$ stand
for an early read or a send action on the internal port of the module concerned.
Likewise, we specify the modules $M_i$, using the following shorthands:

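$$In_i = er_{(M_i,M_i)}(v_1) \parallel er_{(D_i,M_i)}(v_2) \parallel er_{(M_{i-1},M_i)}(v_3) \parallel er_{(M_{i+1},M_i)}(v_4)$$

$$O_i(x) = (s_{(M_i,M_i)}(x) \parallel s_{(M_i,D_i)}(x) \parallel s_{(M_i,M_{i-1})}(x) \parallel s_{(M_i,M_{i+1})}(x) \parallel s_{(M_i,O)}(x))$$

Now,

$$M_i(d) = \tau_{\{c_{(M_i)}\}} \circ \partial_{\{r_{(M_i)},\, s_{(M_i)}\}}(R_i \parallel S_i(d))$$

$$R_i = (In_i\, ;\, s_{(M_i)}(F(v_1, v_2, v_3, v_4)))^\omega$$

$$S_i = (er_{(M_i)}(v)\, ;\, O_i(v))^\omega$$

$$S_i(d) = O_i(d) \cdot S_i.$$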

The port $(M_i, O)$ is the actual output port of the processor, leading to an output
module $O$.

The processors $P_i$ ($0 < i < N$) can now be defined as follows:

$$P_i(d, e) = M_i(d) \parallel D_i(e),$$

with $d$ and $e$ the initial values of coordinate $y_i$ ($y_i(\Delta t)$ and $y_i(0)$, respectively).

For $N + 1$ coordinate pairs, $N - 1$ of these processors are coupled together,
the outer ones also using two border processors (which are simple modules). The
output of all the calculating modules $M_i$ ($0 < i < N$) in the processors is sent
to output module $O$. This module collects the computed values of all processors
and bundles them in a vector. This bundling is somewhat arbitrary; alternatively
$O$ may send its output in parallel to the environment. In a picture:

[Figure: the processors $P_0, \ldots, P_N$ coupled in a row, each connected to the output module $O$.]




As one can see from this picture, the first and the last processor only commu- 
nicate with their neighbor and the output module O. The specification of these 
two processors is, therefore, very simple: 

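$$P_0(d) = \tau_{\{c_{(P_0)}\}} \circ \partial_{\{r_{(P_0)},\, s_{(P_0)}\}}(R_0 \parallel S_0(d))$$

$$R_0 = (er_{(M_1,P_0)}(v)\, ;\, s_{(P_0)}(0))^\omega$$

$$S_0 = (er_{(P_0)}(v)\, ;\, (s_{(P_0,M_1)}(v) \parallel s_{(P_0,O)}(v)))^\omega$$

$$S_0(d) = (s_{(P_0,M_1)}(d) \parallel s_{(P_0,O)}(d)) \cdot S_0$$

$$P_N(d) = \tau_{\{c_{(P_N)}\}} \circ \partial_{\{r_{(P_N)},\, s_{(P_N)}\}}(R_N \parallel S_N(d))$$

$$R_N = (er_{(M_{N-1},P_N)}(v)\, ;\, s_{(P_N)}(0))^\omega$$

$$S_N = (er_{(P_N)}(v)\, ;\, (s_{(P_N,M_{N-1})}(v) \parallel s_{(P_N,O)}(v)))^\omega$$

$$S_N(d) = (s_{(P_N,M_{N-1})}(d) \parallel s_{(P_N,O)}(d)) \cdot S_N.$$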

Note that $P_0$ and $P_N$ need not be split into a calculating and a delay module.
Since we describe a wave through a string with both ends tight, the output value
of processors $P_0$ and $P_N$ will remain zero all the time:

$$P_0(d, e) = P_0(0) \quad\text{and}\quad P_N(d, e) = P_N(0).$$

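The only thing left to specify is the output module $O$. Let $d = d_1, \ldots, d_N$,
then

$$O(d) = \tau_{\{c_{(O)}\}} \circ \partial_{\{r_{(O)},\, s_{(O)}\}}(RO \parallel SO(d))$$

$$RO = \Big(\Big(\big\Vert_{i \in \{0, N\}}\, er_{(P_i,O)}(v_i) \;\parallel\; \big\Vert_{0 < i < N}\, er_{(M_i,O)}(v_i)\Big)\, ;\, s_{(O)}(\vec{v})\Big)^\omega$$

$$SO = (er_{(O)}(w)\, ;\, s_{out}(w))^\omega$$

$$SO(d) = s_{out}(d) \cdot SO.$$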
Now the algorithm can be specified by the parallel composition of $O$ and all
processors $P_i$:

$$\mathrm{WAVE} = \mathrm{WAVE}(0),$$

$$\mathrm{WAVE}(k) = \tau_{\{c_p\}} \circ \partial_{\{r_p,\, s_p\}}\Big( O(y(k \cdot \Delta t)) \;\parallel\; \big\Vert_{i=0}^{N}\, P_i(y_i((k + 1) \cdot \Delta t),\, y_i(k \cdot \Delta t)) \Big)$$

with $y_i(0), y_i(\Delta t)$ ($i = 0, \ldots, N$) arbitrary initial values, and $p$ ranging over the
following set of ports:

$$\{(M_i,M_j), (M_i,D_i), (D_i,M_i), (M_i,O) \mid 0 < i, j < N\} \;\cup$$
$$\{(P_0,M_1), (M_1,P_0), (P_0,O), (M_{N-1},P_N), (P_N,M_{N-1}), (P_N,O)\}.$$

The external behaviour of the algorithm can then be expressed by

$$\tau \cdot \mathrm{WAVE} = \tau \cdot s_{out}(y(0)) \cdot s_{out}(y(\Delta t)) \cdot s_{out}(y(2\Delta t)) \cdots .$$

This follows from Theorem 3.3.1, which gives the following characterization of
our specification:

$$\tau \cdot \mathrm{WAVE}(k) = \tau \cdot s_{out}(y(k \cdot \Delta t)) \cdot \mathrm{WAVE}(k + 1).$$
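The lock-step behaviour captured by this characterization can be imitated with ordinary lists. The following Python sketch is an informal illustration, not the process-algebraic specification itself: `run_wave` and the concrete update `F` are hypothetical (the paper's own F is defined in an earlier section), and communication is replaced by copying values between the lists that stand for the modules $M_i$, $D_i$ and $O$.

```python
def F(y_i, y_i_prev, y_left, y_right, r2=0.25):
    # Hypothetical update function, assumed for illustration only.
    return 2.0 * y_i - y_i_prev + r2 * (y_left - 2.0 * y_i + y_right)


def run_wave(y0, y1, phases):
    """Return the vectors emitted by the output module, one per phase.

    y0 and y1 play the role of y(0) and y(dt); entries 0 and N belong to
    the border processors and are forced to zero.
    """
    n = len(y0) - 1
    m = [0.0] + list(y1[1:n]) + [0.0]   # values held by the modules M_i
    d = [0.0] + list(y0[1:n]) + [0.0]   # values held by the delay modules D_i
    out = list(d)                       # initial value of the output module O
    trace = []
    for _ in range(phases):
        trace.append(out)               # O emits the vector it holds
        sent_m, sent_d = m, d           # every module sends its current value
        out = list(sent_m)              # O collects the outputs of all M_i
        d = list(sent_m)                # D_i stores the output of M_i
        m = [0.0] * (n + 1)             # border values stay zero
        for i in range(1, n):
            m[i] = F(sent_m[i], sent_d[i], sent_m[i - 1], sent_m[i + 1])
    return trace
```

The emitted sequence starts with y(0) and y(dt) and continues with the values produced by F, which mirrors the order of the send actions in the equation for the external behaviour.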



A beating grid protocol modeling this algorithm need not depend on a synchronizing
output module like $O$ defined above. By Theorem 4.3.1 we can omit
$O$ and view the send actions to $O$ as external. For the resulting protocol WAVE$'$
one has a choice in the definition of its output values: either those of
$M_i$, yielding

$$\tau \cdot \mathrm{WAVE}' = \tau \cdot \Big(\big\Vert_{i=0}^{N}\, s_{M_i,O}(y_i(\Delta t))\Big) \cdot \Big(\big\Vert_{i=0}^{N}\, s_{M_i,O}(y_i(2\Delta t))\Big) \cdots$$

not giving output at time 0, or reading the values from the $D_i$ modules instead.
In the latter case, the modules $M_i$ lose their output action $s_{M_i,O}(\ldots)$, and the
$D_i$ should get an extra output action $s_{D_i,O}(\ldots)$. We then obtain output also at
time 0, and hence the two (required) initial configurations of the string (at time
0 and $\Delta t$).



6 Conclusions 

It appears that early read and process prefix form a useful extension to ACP-based
specification formalisms. Grid protocols, intended to model parallel computation,
are considered for two types of architectures:

— Strongly connected networks in which I/O is located at one module. Here
internal computation need not proceed in lock step. By connectedness and
an I/O interface located at one module, I/O proceeds in lock step.

— Networks in which modules need only be strongly connected to a synchronizing
device: the beat (for instance a global clock). Both I/O and internal
actions, i.e., all parallel activity, proceed in lock step.

Future work concerns a more formal treatment of substitution, extensions in
the field of asynchronous networks, and establishing a precise relationship with
protocol specification and verification in μCRL.

We finish with some remarks on the proposed specification format for grid
protocols, comprising early reads, process prefix and no-exit iteration. It is not
essential whether one uses full μCRL or some other data-parametric, recursive
specification format, or the extension of ACP with data and no-exit iteration $^\omega$
as presented in this paper. Restricting all occurring types of networks to single-module
networks, one finds by the characterization results that an identity such
as

$$M(d) = [\ldots\ \text{actions}] \cdot M(F(v))$$

can just as well be regarded as the (data-parametric) specification of a module.
In that case, transformation to the specification format as discussed before (with
$^\omega$ and distinctive read and send parts) yields a specification in which characterization
is relatively easy to prove. Note that the μCRL perspective yields two
basic types of modules, both in the setting without and with a beat process:
those without feedback, as displayed above, and those with feedback, having a
value-update of the form $F(d, v)$ (and the possibility of an initial $\tau$-step if one
cares to model the feedback as explicit activity, which from an operational point
of view seems best). In the specification format discussed in this paper feedback
is treated in the same way as other value-passings, which seems a preferable sort
of modeling.

Finally, one can show that for the internal communication actions it is suf- 
ficient to assume that all outputs are in parallel; input may have a fixed order. 
Because all internal output is performed in parallel this cannot raise any dead- 
lock, which also is a consequence of the characterization results: both approaches 
reduce to the same external characterization. This appears to be useful for (more) 
efficient proto-typing of grid protocols (cf. [Hil96,Pou97]). 



References 

[BB94] J.C.M. Baeten and J.A. Bergstra. On sequential composition, action prefixes 
and process prefix. Formal Aspects of Computing, 6(3):83-98, 1994. 

[BBK87] J.C.M. Baeten, J.A. Bergstra, and J.W. Klop. Conditional axioms and α/β
calculus in process algebra. In M. Wirsing, editor, Formal Description of
Programming Concepts - III, Proceedings of the 3rd IFIP WG 2.2 working
conference, Ebberup 1986, pages 53-75, Amsterdam, 1987. North-Holland.

[BHP97] J.A. Bergstra, J.A. Hillebrand, and A. Ponse. Grid protocols based on syn- 
chronous communication. Science of Computer Programming, 29:199-233, 
1997. 




[BJM97] E. van Buiten, M. de Jonge, and R. Monajemi. Beating Grid Protocols,
PAII-thesis, University of Amsterdam, 1997.

[Pou97] M. Pouw. Beating Grid Protocols, Master's Thesis, Utrecht University, 1997.

[BK84] J.A. Bergstra and J.W. Klop. The algebra of recursively defined processes
and the algebra of regular processes. In J. Paredaens, editor, Proceedings
ICALP, Antwerpen, volume 172 of Lecture Notes in Computer Science,
pages 82-95. Springer-Verlag, 1984. An extended version appeared in
[PVV95], pages 1-25, 1995.

[BK85] J.A. Bergstra and J.W. Klop. Algebra of communicating processes with
abstraction. Theoretical Computer Science, 37(1):77-121, 1985.

[BT84] J.A. Bergstra and J.V. Tucker. Top down design and the algebra of communicating
processes. Science of Computer Programming, 5(2):171-199, 1984.

[BT95] J.A. Bergstra and J.V. Tucker. Equational specifications, complete term
rewriting systems, and computable and semicomputable algebras. Journal
of the ACM, 42(6):1194-1230, 1995.

[BW90] J.C.M. Baeten and W.P. Weijland. Process Algebra. Cambridge Tracts in
Theoretical Computer Science 18. Cambridge University Press, 1990.

[FJL+88] G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker. General
Techniques and Regular Problems, volume 1 of Solving Problems on
Concurrent Processors. Prentice-Hall International, 1988.

[Fok97] W.J. Fokkink. Axiomatizations for the perpetual loop in process algebra. In
P. Degano, R. Gorrieri, and A. Marchetti-Spaccamela, editors, Proc. 24th
Colloquium on Automata, Languages and Programming - ICALP'97, pages
571-581. Lecture Notes in Computer Science Vol. 1256, Springer-Verlag,
Berlin, 1997.

[GK95] J.F. Groote and H. Korver. A correctness proof of the bakery protocol in
μCRL. In [PVV95], pages 63-86, 1995.

[GP91c] J.F. Groote and A. Ponse. Proof theory for μCRL. (Extended version.)
Report CS-R9138, CWI, Amsterdam, 1991.

[GP94b] J.F. Groote and A. Ponse. Proof theory for μCRL: a language for processes
with data. In D.J. Andrews, J.F. Groote, and C.A. Middelburg, editors,
Proceedings of the International Workshop on Semantics of Specification
Languages, pages 232-251. Workshops in Computing, Springer-Verlag, 1994.

[GP95] J.F. Groote and A. Ponse. The syntax and semantics of μCRL. In [PVV95],
pages 26-62, 1995.

[Hil96] J.A. Hillebrand. A simple language for the specification of grid protocols
(working title). Technical Report, Programming Research Group, University
of Amsterdam, to appear.

[Kle56] S.C. Kleene. Representation of events in nerve nets and finite automata. In
Automata Studies, pages 3-41. Princeton University Press, 1956.

[Mil89] R. Milner. Communication and Concurrency. Prentice-Hall International,
Englewood Cliffs, 1989.

[PVV95] A. Ponse, C. Verhoef, and S.F.M. van Vlijmen, editors. Algebra of Communicating
Processes, Utrecht 1994, Workshops in Computing. Springer-Verlag,
1995.

[Smi65] G.D. Smith. Numerical Solution of Partial Differential Equations. Oxford
University Press, 1965.

[TT94] B.C. Thompson and J.V. Tucker. Equational specification of Synchronous
Concurrent Algorithms and architectures (Second Edition). Report CSR
15-94, University of Wales, Swansea, 1994.




306 



J.A. Bergstra and A. Ponse 



Appendix 



In this appendix we recall some basic process algebra (without explicit use of data):
the system $\mathrm{ACP}^\tau(A, \gamma)$, standard concurrency, and no-exit iteration. Furthermore, we
recall expansion and alphabet axioms, all of which are essential for specification and
verification of grid protocols.

The process algebraic framework $\mathrm{ACP}^\tau(A, \gamma)$ (ACP with branching bisimulation)
has two parameters: a set $A$ of constants modeling atomic actions, and a (partial)
binary, commutative and associative communication function $\gamma$ on $A$, defining which
actions communicate. Furthermore there are constants $\delta$ (deadlock or inaction) and $\tau$
(silent step). Process operations in $\mathrm{ACP}^\tau(A, \gamma)$ are alternative composition or choice
($+$), sequential composition ($\cdot$), parallel composition or merge ($\parallel$), left and communication
merge ($\lfloor\!\lfloor$ and $\mid$, used for the axiomatization of $\parallel$), encapsulation ($\partial_H$), and
hiding ($\tau_I$). We mostly suppress the $\cdot$ in process expressions, and brackets according
to the following precedences: $\cdot > \{\parallel, \lfloor\!\lfloor, \mid\} > +$. Process expressions are subject to the
axioms of $\mathrm{ACP}^\tau(A, \gamma)$, displayed in Table 3 ($x, y, z, \ldots$ ranging over processes). Note
that $+$ and $\cdot$ are associative.

We provide a slightly modified version of $\mathrm{ACP}(A, \gamma)$ (i.e., the axioms A1-7, CF1,2
and CM1-8), comprising commutativity of $\parallel$ and $\mid$, defined by the (new) axiom CMC
(Communication Merge is Commutative). As a consequence, the symmetric versions of
CM5 and CM8, i.e. CM6 and CM9 respectively, are left out (cf. [BW90]). Furthermore,
we adopt associativity of $\parallel$ and $\mid$. Commutativity and associativity of these operations
is known as Standard Concurrency [BT84], and is referred to as SC.

In this paper we only considered two-party communication or handshaking (see
[BT84]), which is axiomatized by $x \mid y \mid z = \delta$.

We give some informal explanation on the use of process algebra. Often, $+$ is used as
an operation facilitating analysis rather than as a specification primitive: concurrency is
analyzed in terms of sequential composition, choice and communication. Verification of
a concurrent system $\partial_H(C_1 \parallel \ldots \parallel C_n)$ generally boils down to representing the possible
executions with $+$ and $\cdot$, having applied left-merge ($\lfloor\!\lfloor$), communication merge, and
encapsulation ($\partial_H$, by which communication between components $C_i$ can be enforced).
After renaming internal activity to the silent, unobservable action $\tau$ with help of the
hiding operator $\tau_I$ (also called 'abstraction'), this may yield a simple and informative
specification of external behaviour. For a detailed introduction to $\mathrm{ACP}^\tau(A, \gamma)$ and SC
we refer to [BW90].

In order to describe iterative, non-terminating processes we use the unary operation
$^\omega$, perpetual loop or no-exit iteration, for which NEI1 in Table 3 is the defining axiom.
This operation was introduced by Fokkink in [Fok97]. In that paper, several completeness
results are established, among which the fact that BPA (axioms A1-A5) with NEI1
and RSP$^\omega$ characterizes strong bisimilarity. (The 'missing' axiom NEI2 concerns the
empty process $\varepsilon$, and reads $(x + \varepsilon)^\omega = x^\omega$.) It should be remarked that RSP$^\omega$ is not
sound in the setting with the silent step $\tau$. For example, each process $\tau \cdot P$ satisfies

$$\tau \cdot P = \tau \cdot \tau \cdot P,$$

though $\tau \cdot P = \tau^\omega$ is of course a very undesirable identity.
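Table 3. Axioms of $\mathrm{ACP}^\tau(A, \gamma)$ and for no-exit iteration, where $a, b \in A_{\delta,\tau}$ and $H, I \subseteq A$.

(A1)    $x + y = y + x$                                                   (B1)    $x\tau = x$
(A2)    $x + (y + z) = (x + y) + z$                                       (B2)    $x(\tau(y + z) + y) = x(y + z)$
(A3)    $x + x = x$
(A4)    $(x + y)z = xz + yz$
(A5)    $(xy)z = x(yz)$
(A6)    $x + \delta = x$
(A7)    $\delta x = \delta$

(CF1)   $a \mid b = \gamma(a, b)$ if $\gamma(a, b)\downarrow$
(CF2)   $a \mid b = \delta$ otherwise

(CM1)   $x \parallel y = (x \lfloor\!\lfloor y + y \lfloor\!\lfloor x) + x \mid y$      (D1)    $\partial_H(a) = a$ if $a \notin H$
                                                                          (D2)    $\partial_H(a) = \delta$ if $a \in H$
(CM2)   $a \lfloor\!\lfloor x = ax$                                        (D3)    $\partial_H(x + y) = \partial_H(x) + \partial_H(y)$
(CM3)   $ax \lfloor\!\lfloor y = a(x \parallel y)$                         (D4)    $\partial_H(xy) = \partial_H(x) \cdot \partial_H(y)$
(CM4)   $(x + y) \lfloor\!\lfloor z = x \lfloor\!\lfloor z + y \lfloor\!\lfloor z$
(CMC)   $x \mid y = y \mid x$                                             (TI1)   $\tau_I(a) = a$ if $a \notin I$
(CM5)   $ax \mid b = (a \mid b)x$                                         (TI2)   $\tau_I(a) = \tau$ if $a \in I$
(CM7)   $ax \mid by = (a \mid b)(x \parallel y)$                          (TI3)   $\tau_I(x + y) = \tau_I(x) + \tau_I(y)$
(CM8)   $(x + y) \mid z = x \mid z + y \mid z$                            (TI4)   $\tau_I(xy) = \tau_I(x) \cdot \tau_I(y)$

(NEI1)  $x^\omega = x \cdot (x)^\omega$                                   (NEI3)  $\partial_H(x^\omega) = (\partial_H(x))^\omega$
(RSP$^\omega$)  $x = y \cdot x \;\Longrightarrow\; x = y^\omega$           (NEI4)  $\tau_I(x^\omega) = (\tau_I(x))^\omega$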
The proofs of the characterization results as described in this paper employ the
Expansion Theorem (cf. [BW90]). This theorem holds in the setting of handshaking:
for $n \geq 3$,

$$\big\Vert_{i=1}^{n} P_i \;=\; \sum_{i=1}^{n} P_i \lfloor\!\lfloor \Big(\big\Vert_{j \neq i} P_j\Big) \;+\; \sum_{1 \leq i < j \leq n} (P_i \mid P_j) \lfloor\!\lfloor \Big(\big\Vert_{k \neq i,j} P_k\Big).$$

Also, many (intermediate) results depend on the application of alphabet axioms. Under
certain conditions the scope, or the action set, of $\tau_I$ or $\partial_H$ applications can be changed,
depending on the alphabet of a process. In Table 4 we give some axioms to determine
the alphabet of a process $P$, notation $\alpha(P)$. Except for AB6, these axioms stem from
[BBK87].

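Table 4. Alphabet axioms, $a \in A$.

(AB1)   $\alpha(\delta) = \emptyset = \alpha(\tau)$        (AB4)   $\alpha(ax) = \{a\} \cup \alpha(x)$
(AB2)   $\alpha(a) = \{a\}$                                (AB5)   $\alpha(x + y) = \alpha(x) \cup \alpha(y)$
(AB3)   $\alpha(\tau x) = \alpha(x)$                       (AB6)   $\alpha(x^\omega) = \alpha(x)$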



Starting from the alphabet of a process, the conditional alphabet axioms in Table 5
(also taken from [BBK87]) give conditions for changing either scope or action sets $I$, $H$
of $\tau_I$ and $\partial_H$ applications. Here $B \mid C$ for $B, C \subseteq A$ denotes the set $\{a \in A \mid a =
\gamma(b, c)$ for some $b \in B, c \in C\}$.
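As an informal illustration, the alphabet axioms of Table 4 and the set $B \mid C$ can be computed mechanically on small terms. In the Python sketch below the tuple encoding of terms, the names `alphabet` and `comm`, and the dictionary representation of the partial function $\gamma$ are all assumptions made for the example.

```python
def alphabet(p):
    """Alphabet of a term built from delta, tau, atoms, prefix, + and omega."""
    kind = p[0]
    if kind in ("delta", "tau"):                 # AB1
        return frozenset()
    if kind == "act":                            # AB2
        return frozenset({p[1]})
    if kind == "pre":                            # AB3 (tau prefix), AB4 (action prefix)
        head = frozenset() if p[1] == "tau" else frozenset({p[1]})
        return head | alphabet(p[2])
    if kind == "alt":                            # AB5
        return alphabet(p[1]) | alphabet(p[2])
    if kind == "omega":                          # AB6
        return alphabet(p[1])
    raise ValueError("unknown term: %r" % (p,))


def comm(B, C, gamma):
    """The set B | C; gamma is a dict on ordered pairs (partial function)."""
    return frozenset(gamma[(b, c)] for b in B for c in C if (b, c) in gamma)
```

For instance, the term $(a(b + \tau c))^\omega$ is encoded as `("omega", ("pre", "a", ("alt", ("act", "b"), ("pre", "tau", ("act", "c")))))`.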



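Table 5. Conditional alphabet axioms, $H, I \subseteq A$.

(CA1)   $\alpha(x) \mid (\alpha(y) \cap H) \subseteq H \;\Longrightarrow\; \partial_H(x \parallel y) = \partial_H(x \parallel \partial_H(y))$
(CA2)   $\alpha(x) \mid (\alpha(y) \cap I) = \emptyset \;\Longrightarrow\; \tau_I(x \parallel y) = \tau_I(x \parallel \tau_I(y))$
(CA3)   $\alpha(x) \cap H = \emptyset \;\Longrightarrow\; \partial_H(x) = x$
(CA4)   $\alpha(x) \cap I = \emptyset \;\Longrightarrow\; \tau_I(x) = x$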
Abstract. The aim of this chapter is to define a simple analogue hardware
description language L and give it a sound semantics that supports
formal reasoning about its properties.

The syntax of L is that of a hybrid programming language but the semantics
has been derived from the analogue signal semantics of the upcoming
IEEE VHDL-AMS extension to the IEEE standard digital hardware
description language, VHDL [1].

L will here be given two semantics. Firstly, what may be termed an exact,
or hardware, semantics and secondly an approximation, or simulation,
semantics. The simulation semantics is computable and the hardware semantics
is not. It will be shown that the simulation semantics approximates
the hardware semantics in a well-defined sense. This property is a
"no surprises" guarantee with respect to simulation for the language.



1 Introduction 

The forthcoming discussion will be motivated by means of a small puzzle: an
ideal operational amplifier is an electronic device that monitors two continuously
varying input signals $x_1$, $x_2$ without disturbing them and, insensitive to loading,
produces an amplified difference signal on its single output $y$. The natural way for
a computer simulation to represent the amplifier in this open loop configuration
is as a simple, causal, input-to-output computation (Fig. 1).

$$y(t) = \beta \int_0^{+\infty} (x_1(t - \tau) - x_2(t - \tau))\, h(\tau)\, d\tau = \beta\, h * (x_1 - x_2)(t) \qquad (1)$$

in which $h(t)$ is the impulse response of the amplifier and it is zero for negative
times; $*$ is the convolution operator. Usually the impulse response is taken to be
a decaying exponential with gradient $\gamma$:

$$h(\tau) = \begin{cases} \gamma e^{-\gamma\tau}, & \tau \geq 0 \\ 0, & \tau < 0 \end{cases} \qquad (2)$$

B. Moller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 309-332, 1998. 

© Springer-Verlag Berlin Heidelberg 1998 
Fig. 1. The open loop operational amplifier computes its output y causally from
its inputs.



and for the amplifier to be useful, the amplification $\beta$ has to be greater than 1.
A value of several thousand is normal.

Computational simulations of the amplifier do the integration numerically,
using an approximation grid on the discrete points of which $x$, $h$ and $y$ are
evaluated (3).

$$y(t_{i+1}) = \beta \sum_{j=0}^{\infty} (x_1(t_{i-j}) - x_2(t_{i-j}))\, h(t_j)\, (t_{j+1} - t_j) \qquad (3)$$

But that computation produces strange results if the simulation output is fed
back in negative phase to its input, putting the amplifier in the classical closed
loop (follower) configuration (Fig. 2, Eqn. 4).




Fig. 2. The closed loop operational amplifier feeds its output back to its negative
input.



$$y(t_{i+1}) = \beta \sum_{j=0}^{\infty} (x_1(t_{i-j}) - y(t_{i-j}))\, h(t_j)\, (t_{j+1} - t_j) \qquad (4)$$

In this physical configuration, the amplifier output should follow the input faithfully,
lagging by an amount related to the lag observed in the open loop configuration.
However, that is not the behaviour that is observed from the simulation.
Moreover, it is not an artifact of simulation. The integral equation (5) that corresponds
to the simulation is unstable: it only has solutions that grow unboundedly
in time.



$$y(t) = \beta \int_0^{\infty} (x_1(t - \tau) - y(t - \tau))\, h(\tau)\, d\tau \qquad (5)$$

It can be shown that causal solutions to (5) must have the form (6)

$$y = \beta h * x_1 - (\beta h *)^2 x_1 + (\beta h *)^3 x_1 - \cdots \qquad (6)$$

In this series, contributions from longer and longer ago in the input xi have a 
greater and greater effect on the output y, and that is quite contrary to physical 
intuition. The output does not have a Fourier spectrum even when the input 
signal is very well-behaved (zero for negative times, bounded, with bounded 
absolute and square integrals over time, and with a Fourier spectrum). 

Since the open-loop amplifier is simulated correctly, the conclusion to be
drawn from failure to simulate the closed loop amplifier given a simulation of the
open loop amplifier is:

1. either the closed-loop amplifier cannot be simulated compositionally in terms
of simulations of the open loop amplifier and a feedback wire;

2. or the behaviour of the open-loop amplifier is not represented adequately by its
impulse response h, or the feedback wire behaviour is likewise not adequately
represented;

3. or the method of composing descriptions of components to get a description
of the whole as set out here is not correct.

It is the second item in this list that is correct. The amplifier is capable of many
more behaviours than given by the causal equation (1). The full set is described
by a differential equation (7) to which (1) with $h(t) = \gamma e^{-\gamma t}$ is the solution in
the open loop case:

$$\frac{dy}{dt} = -\gamma\,(y - \beta(x_1 - x_2)) \qquad (7)$$

The equation (7) may be understood by supposing that a "perfect" open loop
amplifier impulse response is given by the Dirac delta centered at zero:

$$y = \beta(x_1 - x_2)$$

and that momentary departures from this ideal may be expected in a real amplifier
under the application of rapidly changing external signals that the output
cannot follow quite rapidly enough. In the event of such a departure, a "restoring
force" acts to restore the ideal and it applies itself to $y$ in proportion $\gamma$ to the
departure $y - \beta(x_1 - x_2)$, with opposite sign.

In the closed loop case, the equation (7) then becomes 
$$\frac{dy}{dt} = -\gamma\,(y - \beta(x - y)) \qquad (8)$$

with solution: 

^ = /“(l + /3) (9) 

that describes a lagging follower, as expected. Physically, the constant $\gamma = 1/RC$,
where $R$ and $C$ are respectively the internal resistance and capacitance of an
equivalent circuit for the amplifier (Fig. 3). The component represented by the
circle in the Figure is an infinite impedance voltage generator.
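The contrast between the recurrence (4) and the differential equation (8) can be tried out numerically. The Python sketch below is only an illustration under stated assumptions: the uniform grid, the values of the gain, rate and step, the unit-step input, the function names, and the use of a backward-Euler step for (8) are all choices made for this example and are not taken from the chapter.

```python
import math


def closed_loop_convolution(x, beta, gamma, dt):
    """Recurrence (4) on a uniform grid; all signals are zero before t_0."""
    y = [0.0] * len(x)
    for i in range(len(x) - 1):
        acc = 0.0
        for j in range(i + 1):
            h_j = gamma * math.exp(-gamma * j * dt)   # impulse response at t_j
            acc += (x[i - j] - y[i - j]) * h_j * dt   # (t_{j+1} - t_j) = dt
        y[i + 1] = beta * acc
    return y


def closed_loop_ode(x, beta, gamma, dt):
    """Backward-Euler integration of dy/dt = -gamma * (y - beta * (x - y))."""
    y = [0.0] * len(x)
    for i in range(len(x) - 1):
        y[i + 1] = (y[i] + gamma * dt * beta * x[i + 1]) / (
            1.0 + gamma * dt * (1.0 + beta))
    return y


# Assumed example values: gain 1000, rate 1, step 0.01, unit-step input.
x = [1.0] * 60
y_conv = closed_loop_convolution(x, beta=1000.0, gamma=1.0, dt=0.01)
y_ode = closed_loop_ode(x, beta=1000.0, gamma=1.0, dt=0.01)
```

On this particular grid the values produced by the recurrence alternate in sign and grow in magnitude, while the integrated differential equation settles close to beta / (1 + beta) times the input, i.e. it behaves as a follower.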




Any network of classical components each of whose voltages and currents
satisfies a linear second order differential equation may be represented by a matrix
of simple first order differential equations in the voltages and currents in the
network, calculated by Kirchhoff's laws (the sum of the currents entering a circuit
node is zero, the voltage felt at a node is the same for all the components touching
it). So we may infer that:

1. a complete circuit behaviour is described by differential equations that are
composed from the differential equations describing its components' behaviours;

2. and analogue component behaviour is represented adequately through differential
equations;

3. and differential equations are composed by identifying their shared variables.
The use of differential equations to describe physical systems is a classical
modelling method. It is only with the advent of computers that it has become
more usual to model physical systems as computations, and it comes as a surprise
to computational scientists that computation is not necessarily an adequate
substrate for the representation of the operational amplifier.

Section 3 sets out the concrete syntax of a small language L that can capture
descriptions of circuits via differential equations. To represent changes in the governing
equations, the language permits computations to decide which to apply at
any moment in time. The language is a cut-down version of the upcoming mixed-signal
extension, called VHDL-AMS, to the IEEE standard digital hardware
description language VHDL [1].

It will be shown that although small, L can express the constructs of VHDL-AMS
that it explicitly lacks. It is also capable of expressing the atomic constructs
of hybrid programming languages. The latter have developed from foundations
in computer science and aim to provide a rigorous basis for the development of
real time systems controlled by software. VHDL-AMS has developed from efforts
by the hardware community to codify their existing designs as computer models.
The separate aims and backgrounds of these two communities have given rise to
very different development tracks that may nevertheless be converging.

The aim of this chapter is to provide

1. a non-computational semantics for the core language, given in terms of classical
solutions to differential equations;

2. for each constant of precision $\delta$, a computational semantics for the core language,
in domain theoretic terms;

3. a demonstration that as $\delta$ tends to zero, the computational semantics
tends to the classical semantics given first, in a well-defined way.

The computational domain in which these discussions are set is presented in
Section 4.

In Section 5 a nonstandard evaluation of logic is set out in which truth is
a real valued quantity. This is appropriate for real-valued problems. The logic
will be used in implementing the small description language that is defined in
Section 3, and an ideal semantics for which is given in Section 6. The semantics
is a set of state-space trajectories for the system being described, each of which
is a piecewise smooth function of time.

Section 7 sets out the computable approximating semantics for the language,
based on a discrete grid of time points, and shows that the approximate state
descriptions tend to the ideal descriptions, thus fulfilling a "no surprises" contract.
Section 8 describes an implementation of the approximating semantics and gives
some experimental results.

2 A Computational Approach 

In the following section a way of representing switched analogue circuits as hybrid
programs will be set out. A hybrid program is an imperative program that
switches state between different regimes governed by different differential equations
according to the results of certain computations. For example, the operational
amplifier will be represented by the process code:

    proc amp (in: x_1, x_2; out: y) {
        y = 0;
        wait until false with
            dy/dt = -γ * (y - β * (x_1 - x_2)) ;
    }

Note that the signals $x_1$ and $x_2$ are called by reference: they are updated by an
exterior process whilst this process runs. The intention is that when this process
is called with parameters

    amp(x, y; y)

then it should automatically behave as the closed loop amplifier is supposed to
behave, namely as a faithful signal follower.



3 Language 



The precise syntax of the small language illustrated by the amp process is given
below. There are only two forms of atomic statement: assignments and waits.

    Circuit   ::= Process*
    Process   ::= Pid ( in: Var* ; out: Var* ) {
                      [ var Var* ; ] Statement
                  }
    Statement ::= Var = Expr ;
                | wait until Elog with Eqn* ;
                | if Elog then Statement else Statement fi ;
                | while Elog do Statement od ;
                | Statement*
    Expr      ::= Float | Var | Expr + Expr | ...
    Elog      ::= true | false | Elog or Elog | ... | Expr = Expr | ...
    Eqn       ::= d Var /dt = Expr



The usual syntactic restrictions apply. Only declared variables may be referred 
to in a process body. A variable may be an in variable (imported) or an out 
variable (exported) or a var variable (local) , but not any two simultaneously. A 
variable may be an out variable of only one (and exactly one) process. Only a 
local variable or an exported variable may be the subject of a differential equation 
or an assignment. The special implicitly imported variable t stands for time and 
is governed by the differential equation 



dt /dt = 1 
A "process" is a piece of syntactic sugar. It serves only to provide a protected
name space for its local variables. If x is the name of a local variable of p, then
it accesses a global location p.x but the language syntax protects against access
via the latter name. A process usually corresponds to a circuit component and its
variables correspond to voltage or current values at interior points in the circuit.

A circuit consists of several processes (i.e., components) running in parallel.
Each of these processes writes via its exported variables to possibly many
listening processes. Communication is 1-many. All writes are non-blocking in the
sense that the listener need not agree to receive a change of state in its imported
variable for the change to take place. No cooperation between processes is implied
by communication. The listening process may take no notice of its input.
As such, the language has similarities to VHDL [1] and Verilog rather than to
CSP. Indeed, the language is intended to be a cut-down version of the proposed
VHDL-AMS mixed-signal extension to VHDL currently in final draft before the
IEEE.

Blocking takes place within each process while it waits for an exit condition 
in a wait statement. This is the same as the until semantics used by Esterel, 
except that local state is not held constant during the wait, but instead develops 
according to the associated differential equations. 
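The behaviour of a wait statement described here can be pictured with a small numerical loop. The Python sketch below is an informal approximation only: the function `wait_until`, its forward-Euler stepping, the step size and the time limit are assumptions for illustration and do not reproduce the semantics defined later in the chapter.

```python
def wait_until(state, exit_cond, derivs, dt=1e-3, t_max=100.0):
    """Block until exit_cond(state) holds, evolving state meanwhile.

    state:     dict of variable values, updated in place
    exit_cond: function state -> bool (the exit condition of the wait)
    derivs:    function state -> dict of d(var)/dt values (the equations)
    Returns the (approximate) time spent waiting.
    """
    t = 0.0
    while not exit_cond(state) and t < t_max:
        rates = derivs(state)            # evaluate all right-hand sides first
        for name, rate in rates.items():
            state[name] += rate * dt     # forward Euler step (assumed method)
        t += dt
    return t


# Usage: a timer variable that runs down at rate -1 from 5.
timer = {"delay": 5.0}
elapsed = wait_until(timer, lambda s: s["delay"] <= 0.0, lambda s: {"delay": -1.0})
```

Because the stepping is approximate, the exit test is written as a threshold (`<= 0.0`) rather than an exact equality.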

This little language is intended to approximate a subset of the upcoming
VHDL-AMS standard (see [3] for a proposed denotational semantics). It has
all the essential features of that language apart from scheduling and the distinction
that VHDL-AMS makes between quantities and signals. Quantities in
VHDL-AMS are analogue values and are governed by differential equations. They
correspond to the variables we have used here. Signals in VHDL-AMS are the
concept inherited from the old VHDL (digital) standard and are intended to
represent digital signals: they admit only abrupt changes from one signal level
to another at predefined discrete times. Therefore they are not governed by differential
equations, or rather are governed by an implicit equation dx/dt = 0
that asserts that they remain constant between discrete time points. Because of
this, their characteristics can by and large be emulated by the use of quantities
governed by dx/dt = 0. The only area in which their behaviour is not straightforward
to emulate is scheduling.

In VHDL, changes to a signal value can be scheduled for a future time. In the 
language that we have set up here, explicit scheduling is not provided for, but one 
can signal a helper process to begin a wait that terminates at the scheduled time 
and launches an interrupt. The command “x <= 1 after 5” in VHDL is 
equivalent to the command “x_delay = 5; x_val = 1; x_req = 1; wait until x_ack = 
1; x_req = 0” here, when the helper process contains the following code: 

    in x_val, x_delay, x_req; 
    var val, delay; 
    out x_ack; 

    wait until x_req = 1;              /* wait on schedule attempt */ 
    while x_req = 1; do                /* repeat until timed exit */ 
        val = x_val;                   /* grab scheduled value */ 
        delay = x_delay;               /* grab scheduled delay */ 
        x_ack = 1;                     /* acknowledge scheduling */ 
        wait until x_req = 0; 
        wait until delay <= 0 or x_req = 1 with d delay/dt = -1; 
                                       /* exit at scheduled time */ 
    done 
    x = val;                           /* make scheduled assignment */ 



This helper process has been constructed to also treat signal preemptions cor- 
rectly. Preemption is the classical VHDL semantics for scheduling. A scheduled 
signal change may be preempted by a new scheduling request that has been 
launched after the original request. The new request may assert a new delay or a 
new scheduled value. The second scheduling request, not the first, is the one that 
is eventually always honoured - unless it too is preempted before it matures. 
Even if the second request has asked for a longer delay, the second request is 
honoured and the first is discarded. 
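
The preemption rule just described can be sketched outside the toy language. The following Python fragment is only an illustration under stated assumptions: the function name resolve_schedule and the representation of a request as a triple (issue time, value, delay) are hypothetical and do not appear in the language itself. It records which scheduled assignments actually mature when each new request discards any request still pending.

```python
def resolve_schedule(requests):
    """Apply the preemption rule to a list of scheduling requests.

    requests: list of (issue_time, value, delay), sorted by issue_time
              (a hypothetical representation chosen for this sketch).
    Returns the list of (fire_time, value) assignments that mature.
    """
    fired = []
    pending = None  # the single outstanding request, as (fire_time, value)
    for issue_time, value, delay in requests:
        if pending is not None and pending[0] <= issue_time:
            # the earlier request matured before the new one was launched
            fired.append(pending)
        # otherwise the earlier request is preempted and silently discarded
        pending = (issue_time + delay, value)
    if pending is not None:
        fired.append(pending)  # the last request is never preempted
    return fired
```

For instance, a request at time 0 for a change after 5 followed by a request at time 2 for a change after 10 leaves only the second assignment, at time 12, even though it asked for the longer delay.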

The absence of an explicit interrupt facility in the language here has made 
the helper process code more complicated than it otherwise need have been. The 
loop in the code is only there to check to see if an exit from the innermost wait 
occurred because of an interrupt or because of the maturation of the timer that 
had been set at entry. If the exit occurs early, then it is because of a rescheduling 
attempt and new timer and scheduled values must be set before reentering the 
wait state. The absence of subroutines and macros is not helpful either. But the 
language has been deliberately constructed in minimalist style in order that it 
lend itself to study the more easily. 

It must be concluded from the above code fragment that explicit scheduling 
is not a necessary part of the language from the point of view of its expressive 
power, but it clearly is necessary from the point of view of programming convenience. 
However, it is not clear that scheduling remains necessary as a simulation tech- 
nique, once the framework includes analogue quantities governed by differential 
equations. Scheduling a delay is a way of modelling real analogue device per- 
formance characteristics in a digital environment. It is now possible to model the 
performance more exactly, by means of differential equations. So it may not be 
necessary to include scheduling constructs in an analogue hardware description 
language, just as it is not necessary to provide explicit mechanisms for monit- 
ors or spinlocks, or other concepts familiar from parallel computer programming 
languages. 

Another construct missing from the language is a class of nondeterministically 
delayed and preemptible imperatives. In this language all imperatives apart 
from wait take zero time to execute. But frequently it is necessary to model real 
systems in which a reaction may take some time to trigger, or in which a triggered 
but delayed action may abort if the trigger condition is not maintained for long 
enough. Such a construct is typically encountered in hybrid programming 
languages. If the triggering condition e is maintained for tmin then the action a may 
choose to fire at tmin (but not before). If the condition is maintained for tmax then 
the action must fire (at tmax). If the condition is maintained for a time t with 
tmin < t < tmax then the action chooses nondeterministically whether or not to 
fire at t. Once the action has fired, the condition e is irrelevant. The construct may 
be described as 

if e persists to tmin then a before tmax 

and it may be emulated by the following code fragment, supposing access to a 
function or input signal random(t) that can provide a varying boolean feed: 

    wait until not e or t >= tmin with ...         /* current equations */ 
    if t >= tmin then 
        wait until not e or random(t) or t >= tmax with ... 
                                                   /* current equations */ 
        if t >= tmax or random(t) then a; fi; 
    fi; 

Nondeterminism is not necessary for a valid implementation, but it would be more 
satisfactory in practice. In any case, the code fragment shows that imperative 
actions that possess a duration, and even those that may be preempted, may be 
constructed from the toy language constructs given here. 
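
The two-stage wait above can be mimicked on a discrete time grid. The sketch below is an illustration only: the function persists_then_fire, the callables e and coin (standing in for the condition e and the feed random(t)), and the step dt are all hypothetical names introduced here, and the continuous equations are ignored. It returns the time at which the action a would fire, or None if the trigger lapses first.

```python
def persists_then_fire(e, coin, tmin, tmax, dt):
    """Grid-based sketch of 'if e persists to tmin then a before tmax'.

    e, coin: functions of time returning booleans (assumed inputs).
    Returns the firing time of the action, or None if it aborts.
    """
    n = 0
    # first wait: until not e or t >= tmin
    while e(n * dt) and n * dt < tmin:
        n += 1
    if n * dt < tmin:
        return None  # condition lapsed before tmin: no action
    # second wait: until not e or random(t) or t >= tmax
    while e(n * dt) and not coin(n * dt) and n * dt < tmax:
        n += 1
    t = n * dt
    if t >= tmax or coin(t):
        return t  # the action fires here
    return None  # condition lapsed between tmin and tmax
```

With a feed that is never true the action fires at tmax, and with a feed that is always true it fires at tmin, which are the two extremes allowed by the construct.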

4 Semantic Domains 

The basic domain in which all quantities will take their values is the reals, 
augmented with $\bot$ and $\top$. 

Statements of the hardware description language will have the semantics of 
a dependent path. A path is a piecewise smooth trajectory through state space 
with support a closed interval of time. A path assigns two real ordinates to every 
point of its support, the left and right values, representing limits from the left 
and right. The paths followed by programs are dependent on the paths followed 
by the external variables: 

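\[
\begin{aligned}
\mathit{Semantics} &= \mathit{Path} \to \mathit{Path} \\
\mathit{Path} &= \bigcup_{[a,b]} \bigl([a,b] \to (\mathit{State}, \mathit{State})\bigr) \times \{\surd, \bot\} \\
\mathit{State} &= \mathit{Id} \to \mathbb{R} \cup \{\bot, \top\} \\
\mathit{Time} &= [0, \infty)
\end{aligned}
\]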

States assign an augmented real value to every identifier. $\bot$ denotes the undefined 
value, to be thought of as meaning “every and any real value is possible here,” 
and $\top$ denotes an error: “no real value is possible here.” A state will be said to 
be “in error” if any of its values is equal to $\top$, but we will not identify the error 
states together. 

The topology of the reals is extended to $\mathbb{R} \cup \{\bot, \top\}$ by letting the 
neighbourhoods of $\bot$ be all sets that include $\bot$. Thus $\bot$ is the unique limit of any 
sequence that eventually always is $\bot$. 

Similarly, the neighbourhoods of $\top$ are all sets that include $\top$, and $\top$ is 
the unique limit of any sequence that eventually always is $\top$. 

The notations $y^-$, $y^+$ denote the left and right values at $t$ in the support of 
a path $f$, for which we will say that $f\,t\,x = y = (y^-, y^+)$. That is 

\[
\pi_1(f\,t)\,x = y^-, \qquad \pi_2(f\,t)\,x = y^+, \qquad (y^-, y^+) = y
\]

and we use $y^\circ$ for the average value, and write $f^\circ t\,x = y^\circ$, $f^- t\,x = y^-$, 
$f^+ t\,x = y^+$: 

\[
y^\circ = \frac{y^- + y^+}{2}, \qquad
y^- = \lim_{\tau \to t^-} f^\circ \tau\, x, \qquad
y^+ = \lim_{\tau \to t^+} f^\circ \tau\, x
\]

These domains all have all finite limits: State is the full domain of total functions, 
with the pointwise refinement ordering; Path is a domain of partial functions - 
each function has a support on which it is defined. A support is a contiguous 
interval $[a, b]$ of the time line 

\[
\mathit{Time} = [0, \infty)
\]

A path either terminates (signified by a $\surd$ symbol) or is prepared to continue 
(signified by $\bot$). In the latter case it is a partial path. A terminating path refines 
another terminating path if its support is exactly the same and it refines the first 
pointwise on the support. Otherwise one path refines another only if the latter 
is partial and has an equal or shorter support, and there is pointwise refinement 
on the common initial segment of the supports. 

We define the functions initial and final on paths, which obtain the initial and 
final state and time: 

Definition 1. 



\[
\mathit{initial}\, f = (f^- t_0,\, t_0), \qquad
\mathit{final}\, f = (f^+ t_1,\, t_1), \qquad
\text{where } \operatorname{dom} f = [t_0, t_1]
\]

Note that the pointwise meet $f \sqcap g$ of two continuous paths $f$ and $g$ is continuous. 
If the limit of $(f \sqcap g)\,t_i$ as the $t_i$ approach $t$ is not a real number, then it must be 
that $f\,t = \bot$ or $\top$ and/or $g\,t = \bot$ or $\top$. In this case the limit is eventually always 
achieved by the sequence, and so $(f \sqcap g)\,t_i$ will either eventually always be $f\,t_i$, 
or eventually always be $g\,t_i$. 

This definition is required for the semantics of parallelism. Putting two 
processes in parallel signifies calculating the path $f \sqcap g$. If the paths disagree with 
real values at a point $t$, then the parallel composition $f \sqcap g$ has an error at $t$. I.e. 
if $f\,t \neq g\,t$ and $f\,t, g\,t \in \mathbb{R}$, then $(f \sqcap g)\,t = \top$. 
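
The pointwise combination of two states used in parallel composition can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the sentinels BOT and TOP and the functions meet and parallel are hypothetical names, the undefined value is assumed to act as the unit of the combination (it gives way to any value), and the error value is assumed to be absorbing.

```python
BOT = object()  # undefined: any real value is still possible (assumed unit)
TOP = object()  # error: no real value is possible (assumed absorbing)

def meet(a, b):
    """Combine two augmented real values contributed by parallel processes."""
    if a is TOP or b is TOP:
        return TOP
    if a is BOT:
        return b
    if b is BOT:
        return a
    return a if a == b else TOP  # two different reals disagree: error

def parallel(s1, s2):
    """Combine two states (dicts from identifiers to values) pointwise.

    Identifiers missing from a state are treated as undefined (BOT).
    """
    return {k: meet(s1.get(k, BOT), s2.get(k, BOT)) for k in set(s1) | set(s2)}
```

Two processes that write different variables combine without conflict, while two processes that assert different real values for the same variable produce an error at that variable.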
Returning now to the language L: to get the semantics $\llbracket s_1; s_2 \rrbracket$ of a pair of 
statements $s_1$, $s_2$ in sequence, we will eventually put the path that we get from 
$\llbracket s_1 \rrbracket$ in sequence with the path we get from $\llbracket s_2 \rrbracket$, given the final state of the first 
path as initial. 

Assignment statements will have a semantics as a special sort of path. They 
take no time to execute, and as such are sited at particular points in time, at 
which they cause a discontinuous change in state. They can be regarded as triples 
of time points and pairs of before/after states: 

\[
\mathit{Discontinuity} \subset \mathit{Path}, \qquad
\mathit{Discontinuity} = (\mathit{Time}, \mathit{State}, \mathit{State})
\]

The representation of $(t, \sigma_-, \sigma_+)$ in Path is $(\{(t, (\sigma_-, \sigma_+))\}, \surd)$. 

A discontinuity $(t, \sigma_-, \sigma_+)$ can abut a path with support $[t_0, t]$ terminating in 
a right state $\sigma_-$ and the combination produces a path with the same support but 
terminating in a right state $\sigma_+$. Two triples based at the same time point can 
likewise be combined. 

5 Logic 

We will evaluate logical propositions in a truth domain 

\[
\Omega = \mathbb{R} \cup \{-\infty, +\infty\}
\]

consisting of all the real numbers and $\pm\infty$, not only the booleans {T, F}. This is 
appropriate to problems that deal with continuously varying values. 

In this domain the more positive a result is, the more false it is, and the 
more negative a result is, the more true it is! To get a grip on this, we define the 
positive and negative parts of this interpretation of logic separately. Both parts 
evaluate into the non-negative part of $\Omega$, and then the whole evaluation is the 
difference: 

\[
\llbracket - \rrbracket :: \mathit{Elog} \to \Omega, \qquad
\llbracket x \rrbracket = \llbracket x \rrbracket_+ - \llbracket x \rrbracket_-
\]

In the positive interpretation we get the degree of falsity of a condition, and in 
the negative interpretation we get the degree of truth. 



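\[
\begin{array}{ll}
\textbf{Falsity} & \textbf{Truth} \\
\llbracket x \text{ and } y \rrbracket_+ = \max\, \llbracket x \rrbracket_+\, \llbracket y \rrbracket_+ &
\llbracket x \text{ and } y \rrbracket_- = \min\, \llbracket x \rrbracket_-\, \llbracket y \rrbracket_- \\
\llbracket x \text{ or } y \rrbracket_+ = \min\, \llbracket x \rrbracket_+\, \llbracket y \rrbracket_+ &
\llbracket x \text{ or } y \rrbracket_- = \max\, \llbracket x \rrbracket_-\, \llbracket y \rrbracket_- \\
\llbracket \text{not } x \rrbracket_+ = \llbracket x \rrbracket_- &
\llbracket \text{not } x \rrbracket_- = \llbracket x \rrbracket_+ \\
\llbracket \text{true} \rrbracket_+ = 0 &
\llbracket \text{true} \rrbracket_- = +\infty \\
\llbracket \text{false} \rrbracket_+ = +\infty &
\llbracket \text{false} \rrbracket_- = 0 \\
\llbracket u < v \rrbracket_+ = \max\, (u - v)\, 0 &
\llbracket u < v \rrbracket_- = \max\, (v - u)\, 0 \\
\llbracket u = v \rrbracket_+ = \operatorname{abs}(u - v) &
\llbracket u = v \rrbracket_- = 0
\end{array}
\]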





P.T. Breuer, N. Martinez Madrid, and C. Delgado Kloos 



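Universal and existential quantification will be implemented via sup and inf. 

\[
\begin{array}{ll}
\textbf{Falsity} & \textbf{Truth} \\
\llbracket \forall x | p . q \rrbracket_+ = \sup \{ \min\, \llbracket p \rrbracket_-\, \llbracket q \rrbracket_+ \} &
\llbracket \forall x | p . q \rrbracket_- = \inf \{ \max\, \llbracket p \rrbracket_+\, \llbracket q \rrbracket_- \} \\
\llbracket \exists x | p . q \rrbracket_+ = \inf \{ \max\, \llbracket p \rrbracket_+\, \llbracket q \rrbracket_+ \} &
\llbracket \exists x | p . q \rrbracket_- = \sup \{ \min\, \llbracket p \rrbracket_-\, \llbracket q \rrbracket_- \}
\end{array}
\]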



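It is easy to prove by induction that the positive interpretation is positive 
only when the negative interpretation is zero and vice versa: 

Proposition 2. 

\[
\llbracket x \rrbracket_+ > 0 \;\to\; \llbracket x \rrbracket_- = 0, \qquad
\llbracket x \rrbracket_- > 0 \;\to\; \llbracket x \rrbracket_+ = 0
\]

so that only one of $\llbracket x \rrbracket_+$, $\llbracket x \rrbracket_-$ is ever positive at a time. Hence 
$\llbracket x \rrbracket_+ = \max\, \llbracket x \rrbracket\, 0$ is the positive part of $\llbracket x \rrbracket$ and 
$\llbracket x \rrbracket_- = -\min\, \llbracket x \rrbracket\, 0$ is the negative part of $\llbracket x \rrbracket$. 

This means that negation is $\llbracket \text{not } x \rrbracket = -\llbracket x \rrbracket$ and therefore that implication is 
$\llbracket x \to y \rrbracket = \min\, \llbracket y \rrbracket\, (-\llbracket x \rrbracket)$. 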
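
The quantifier-free part of this interpretation is small enough to sketch directly. In the Python fragment below, which is an illustration only, a proposition is represented by a pair (falsity, truth) of non-negative numbers; that representation and the names lt, eq, and_, or_, not_ and value are assumptions made for the sketch.

```python
import math

# A proposition is represented as a pair (falsity, truth), both non-negative.
TRUE = (0.0, math.inf)
FALSE = (math.inf, 0.0)

def lt(u, v):
    """u < v: false to degree max(u - v, 0), true to degree max(v - u, 0)."""
    return (max(u - v, 0.0), max(v - u, 0.0))

def eq(u, v):
    """u = v: false to degree abs(u - v), true to degree 0."""
    return (abs(u - v), 0.0)

def and_(x, y):
    return (max(x[0], y[0]), min(x[1], y[1]))

def or_(x, y):
    return (min(x[0], y[0]), max(x[1], y[1]))

def not_(x):
    return (x[1], x[0])  # swap falsity and truth

def value(x):
    """Whole evaluation: falsity minus truth (negative means true)."""
    return x[0] - x[1]
```

For example, value(lt(1, 3)) is -2 (true with margin 2), value(not_(lt(1, 3))) is 2, and evaluating an implication as or_(not_(x), y) agrees with the formula min of the value of y and minus the value of x.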

One of the useful aspects of this interpretation is that it is, excluding the 
quantifier interpretation and adding the point at infinity, continuous and 
piecewise differentiable in any free variables. 

If quantifiers are taken into account, then the interpretation is still continuous 
(and piecewise smooth) provided that the atomic statements are continuous (and 
smooth) in all variables simultaneously. For example, consider 
$\llbracket \forall t \,|\, t < t' .\; t < 1 \rrbracket_+$. The interpretation depends on the interpretations 
$\llbracket t < t' \rrbracket_-$ and $\llbracket t < 1 \rrbracket_+$, which are $\max\,(t' - t)\,0$ and $\max\,(t - 1)\,0$ 
respectively. These interpretations are jointly continuous in $t$ and $t'$, 
and the interpretation of the quantified statement is 

\[
\sup_t \{ \min\, (\max\,(t' - t)\,0)\, (\max\,(t - 1)\,0) \} = \max\, \frac{t' - 1}{2}\, 0
\]

and is continuous (and piecewise smooth) as a result. It is possible to use this 
interpretation with generic numerical methods that find the zeros of continuous 
and piecewise differentiable functions. 

If a result is classically true, then it evaluates to a non-positive number here, and 
if it is classically false, then it evaluates to a non-negative number: 

Proposition 3. 

\[
x \text{ is true} \;\to\; \llbracket x \rrbracket \le 0, \qquad
x \text{ is false} \;\to\; \llbracket x \rrbracket \ge 0
\]

The only possibility of getting 0 simultaneously both from the negative and 
positive evaluations is when the state lies exactly on the boundary of an underlying 
inequality or equality. A modification to the interpretation of equality could make 
the conditions mutually exclusive, but at the cost of introducing discontinuities 
on a set of measure zero. 

To this substructure we add the notation used earlier for left and right limits. 
A proposition $p(t)$ that depends on time $t$ and which evaluates into this 
real-valued truth domain may take different values close to $t$ on its left and on its 
right on any particular path: 

\[
\llbracket p \rrbracket^-(t) = \lim_{\tau \to t^-} \llbracket p(\tau) \rrbracket, \qquad
\llbracket p \rrbracket^+(t) = \lim_{\tau \to t^+} \llbracket p(\tau) \rrbracket
\]

In particular, the idea that a path is continuous at a point $t$ in variables $x$ may 
be expressed as 

\[
\exists y.\; \llbracket x = y \rrbracket^- \wedge \llbracket x = y \rrbracket^+
\]

or 

\[
\llbracket \exists y.\; x^- = y \wedge x^+ = y \rrbracket = \operatorname{abs}(x^- - x^+)/2 \ge 0
\]

Proposition 4. $\llbracket \exists y.\,(x^- = y \wedge x^+ = y) \rrbracket$ evaluates as false (i.e., $> 0$) precisely 
at the points $t$ on a path where it is discontinuous in $x$. 

When dealing with predicates $p$ that vary continuously along a path, one 
derived property that is of particular interest is the fact that the time domain 
over which they take at least a given truth value extends further to the right than 
the point of first evaluation: 

\[
\llbracket p(t) \rrbracket > a \;\to\; \exists \delta t > 0.\ \forall \tau \in [t, t + \delta t).\ \llbracket p(\tau) \rrbracket \ge a
\]

(and equally for $\llbracket p(t) \rrbracket < a$). This property of predicates will be called sharpness: 

Definition 5. A property p is sharp on a path f, if (1) whenever it has a truth 
value more than a at t, it has at least the value a through a small interval to the 
right of t, (2) whenever it has a truth value less than a at t, it has at most that 
value through a small interval to the right of t. 

A sharp property is one which holds in intervals of the shape $[a, b)$. I.e., anywhere 
that the property is true to a certain degree, it is true to that degree also for a 
short interval to the right. And similarly for falsity. 

This is a weaker requirement than continuity. For example, the function 

\[
f(t) = \begin{cases} 1, & t \in [2n, 2n + 1) \\ 0, & t \in [2n + 1, 2n + 2) \end{cases}
\]



is sharp but not continuous. 



Proposition 6. All continuous properties are sharp. 
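
The difference between closed-open and open-closed pieces can be probed numerically. The sketch below is only an illustration: it samples a function a little to the right of a point and checks that the value there is retained, which is a finite approximation to sharpness and not a proof of it. The names square, left_closed and holds_to_right, and the probe width dt, are hypothetical.

```python
import math

def square(t):
    """The square wave above: 1 on [2n, 2n+1), 0 on [2n+1, 2n+2)."""
    return 1.0 if math.floor(t) % 2 == 0 else 0.0

def left_closed(t):
    """A variant with pieces (2n, 2n+1] and (2n+1, 2n+2]: not sharp."""
    return 1.0 if math.ceil(t) % 2 == 1 else 0.0

def holds_to_right(f, t, dt=1e-3, samples=10):
    """Check that f keeps its value at t on sample points of [t, t + dt)."""
    return all(f(t + k * dt / samples) == f(t) for k in range(samples))
```

At the switching point t = 1 the square wave keeps its new value to the right, whereas the open-closed variant changes value immediately after t = 1 and so fails the check.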
Sharpness for predicates generalizes a classical property of boolean predicates 
over a finite state machine with real world inputs: a boolean predicate is sharp 
if whenever it is true, it stays true for a while, and whenever it is false, it stays 
false for a while. I.e. 

\[
\begin{aligned}
p(t) &\;\to\; \exists \delta t > 0.\ \forall \tau \in [t, t + \delta t).\ p(\tau) \\
\neg p(t) &\;\to\; \exists \delta t > 0.\ \forall \tau \in [t, t + \delta t).\ \neg p(\tau)
\end{aligned}
\]

This classical property reflects the behaviour of a finite state machine given time 
varying inputs. The machine cannot examine its inputs with infinite precision 
because looking at decimals takes instruction cycles, and arbitrarily many may 
be required in principle to distinguish between 0.99999...9 and 1.0000...1. So 
a computer that guarantees to progress must, in a given state, stop looking at 
decimals somewhere, and then is insensitive to variations of finer precision. So 
the computer holds the outputs it controls momentarily constant in the face of 
such variations. Putting it another way, once it is in a given state it cannot leave 
that state for some small positive interval of time. 

If the outputs of the computer are a combination of its internal state and 
continuously varying external factors, however, then all that can be said is that 
the outputs are locally continuous. They consist of piecewise continuous fragments. 
The intervals of continuity are precisely closed-open intervals of the shape $[a, b)$. 
At the right hand edge of the intervals the computer “wakes up” to the fact that 
something has changed and changes its state abruptly, leading to a discontinuous 
change in the output. The edges of the intervals do not cluster anywhere provided 
that the computer never goes into an infinite loop. 

Sharpness of predicates captures the latter state of affairs exactly. Note first 
that any function that is piecewise continuous with the pieces consisting of a 
sequence of abutting closed-open intervals $[a_i, a_{i+1})$ that cover the whole real 
line is sharp (by the definition of sharp). 

Conversely, we can show that sharp as stated here is equivalent to the notion 
of continuity when the domain space has the topology generated by the base of 
neighbourhoods $[t, t')$ of $t$. This is because, in that topology, limits are necessarily 
downgoing limits $t_i \downarrow t$. So continuity over that topology means exactly that a 
function turns downgoing limits into limits. It follows easily from the definition 
of sharp that a sharp function turns downgoing limits into limits. Hence: 

Proposition 7. A sharp function is precisely a continuous function when the 
time domain is given the topology in which the closed-open intervals $[t, t')$ are a 
base of neighbourhoods for $t$. 



Continuity over this topology means preservation of downgoing limits. Clearly 
both classically continuous functions and functions piecewise continuous over 
abutting intervals $[a_i, a_{i+1})$ preserve these limits. There are no more candidate 
functions whose points of discontinuity do not cluster to the right. 

6 Semantic Functions 

The semantics described loosely earlier may be formalized as follows. Let $(;)$ be 
the semantic operator that combines abutting paths (and discontinuities), the first 
one of which is terminating: 

\[
(;) :: \mathit{Path} \to \mathit{Path} \to \mathit{Path}
\]
\[
f_{a,b} \,;\, g_{b,c} = h_{a,c} \quad \text{if } f^+ b = g^- b,
\quad \text{where} \quad
h\,t = \begin{cases} f\,t, & t \in [a, b) \\ g\,t, & t \in (b, c] \end{cases}
\qquad
\begin{pmatrix} h^- t \\ h^+ t \end{pmatrix} = \begin{pmatrix} f^- b \\ g^+ b \end{pmatrix}, \quad t = b
\tag{10}
\]

The termination attribute of the result is summarized in the following table, in 
which the row gives the attribute of the first path and the column that of the second: 

\[
\begin{array}{c|cc}
(;) & \surd & \bot \\
\hline
\surd & \surd & \bot \\
\bot & - & -
\end{array}
\]



The semantic function will evaluate statements to paths on the variables in 
scope predicated on the path of the external variables. 

\[
\llbracket - \rrbracket :: \mathit{Statement} \to \mathit{Semantics} \tag{11}
\]

Let $\mu$ be the least fixpoint operator in the Semantics domain. I.e. it finds the 
least invariant path transformer: 

\[
\begin{aligned}
\mu &:: ((\mathit{Path} \to \mathit{Path}) \to \mathit{Path} \to \mathit{Path}) \to \mathit{Path} \to \mathit{Path} \\
\mu F f_0 &= \bot \sqcup F(\mathit{const}\,\bot) f_0 \sqcup F^2(\mathit{const}\,\bot) f_0 \sqcup \dots
\end{aligned}
\]

and identify the trajectories that a system $dv/dt = e$ may take given external 
trajectories $f$ via the following predicate. 

Definition 8. Let $O(f, g, b, v, e)$ be the predicate (“oracle”) that says that 

- $g_{t_0,t_1}$ is a path satisfying the equations $dv/dt = e$ on the interval $(t_0, t_1)$ 
starting with initial state and time $(\sigma_0, t_0) = \mathit{initial}\, f$; 

- $g|_{\bar v} = f|_{\bar v}$ (equality on variables not among $v$) on this interval; 

- $g|_v \sqsupseteq f|_v$ (refinement on variables among $v$) on this interval; 

- $t_1 > t_0$ is the first point in time on the path at which the conditions $b$ become 
true (i.e. $f\,t_1\,b \le 0$). 

That is: 

\[
O(f, g, b, v, e) \iff
t_1 = \sup \Bigl\{ t \;\Big|\; t > t_0,\ \forall \tau \in (t_0, t).\
\lim_{\tau' \to \tau} \frac{g\,\tau'\,v - g\,\tau\,v}{\tau' - \tau} = g\,\tau\,e,\ \ g\,\tau\,b > 0 \Bigr\}
\]

The important property of this predicate (for a particular set of equations $dv/dt = 
e$ and condition $b$) is that, given the classical uniform topology on the function 
spaces: 

1. the path described by the variables $v$ is uniquely and continuously and 
causally determined by the starting conditions of the variables $v$ in $\sigma_0$ at $t_0$ in $f$ 
plus the path in $f$ taken by the variables other than $v$ referenced by $e$ (i.e., 
imported external variables); 

2. the stopping time is uniquely and continuously determined by the initial 
condition of the variables $v$ in $\sigma_0$ at $t_0$ in $f$ plus the path in $f$ taken by the 
variables other than $v$ referenced by $b$ and $e$ over $[t_0, t_1]$. 

Causality here means the absence of dependence on the state at future times. 
This type of dependence may not be true for all sets of differential equations, 
but it will be true for those that we treat. 

Using the nonstandard interpretation of logic set out in Section 5, these 
conditions mean precisely that 

1. $\llbracket O(t_0, t_1, \sigma_0, f, b, v, e) \rrbracket$ is a real continuous function of its real and functional 
parameters, using the appropriate topology (real and uniform, respectively). 

2. $\llbracket O(t_0, t_1, \sigma_0, f, b, v, e) \rrbracket$ is dependent only on the portion of $f$ restricted to 
$[t_0, t_1]$. 



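The defined action of a state $\sigma$ on a variable $v$ has been taken to extend naturally 
to expressions $e$ and conditions $b$ in the above. That is, $f\,t\,e$ is the evaluation 
of the state $\sigma = f\,t$ on the expression $e$ in the natural way (interpreting + as 
addition, and so on, and taking arithmetic operations and comparators to be 
strict; $\bot + 5.4321 = \bot$, etc.), and $f\,t\,b$ is the evaluation of the state $\sigma = f\,t$ on 
the condition $b$ using the real-valued logic of Section 5. Then we may express the 
semantics of the language as follows: 

\[
\begin{array}{rcl}
\llbracket p_1 \dots p_n \rrbracket f_0 &=& \llbracket p_1 \rrbracket f_0 \sqcup \dots \sqcup \llbracket p_n \rrbracket f_0 \\[4pt]
\llbracket p\,(\dots)\{\,\mathtt{var}\ \dots;\ s\,\}; \rrbracket f_0 &=& \llbracket s(p.v_1/v_1, \dots, p.v_n/v_n) \rrbracket f_0 \\[4pt]
\llbracket s_1 \dots s_n \rrbracket f_0 &=& \llbracket s_1 \rrbracket f_0 \,;\, \dots \,;\, \llbracket s_n \rrbracket f_{n-1} \\
&& \text{where } f_i = (t_i, \sigma_i, \bot) \,;\, f_{i-1}|_{[t_i, \infty)} \\
&& \text{and } (\sigma_i, t_i) = \mathit{final}\, (\llbracket s_i \rrbracket f_{i-1}) \\[4pt]
\llbracket v = e; \rrbracket f_0 &=& f_0 \oplus \{(t_0, \sigma_0, \sigma_0')\} \\
&& \text{where } \sigma_0' = \sigma_0 \oplus \{(v, \sigma_0\, e)\} \\
&& \text{and } (\sigma_0, t_0) = \mathit{initial}\, f_0 \\[4pt]
\llbracket \mathtt{wait\ until}\ b\ \mathtt{with}\ dv/dt = e; \rrbracket f_0 &=& f_1 \\
&& \text{where } O(f_0, f_1, b, v, e) \\[4pt]
\llbracket \mathtt{if}\ b\ \mathtt{then}\ s_1\ \mathtt{else}\ s_2\ \mathtt{fi}; \rrbracket f_0 &=&
\begin{cases} \llbracket s_1 \rrbracket f_0, & \text{if } \llbracket b \rrbracket(\sigma_0, t_0) \le 0 \\
\llbracket s_2 \rrbracket f_0, & \text{if } \llbracket b \rrbracket(\sigma_0, t_0) > 0 \end{cases} \\
&& \text{and } (\sigma_0, t_0) = \mathit{initial}\, f_0 \\[4pt]
\llbracket \mathtt{while}\ b\ \mathtt{do}\ s\ \mathtt{od}; \rrbracket f_0 &=& \mu F f_0 \\
&& \text{where } F \pi f_0 =
\begin{cases} (t_0, \sigma_0, \sigma_0), & \text{if } \llbracket b \rrbracket(\sigma_0, t_0) > 0 \\
(\pi \circ \llbracket s \rrbracket) f_0, & \text{otherwise} \end{cases} \\
&& \text{where } (\sigma_0, t_0) = \mathit{initial}\, f_0
\end{array}
\]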

This semantics uses the facts that 



1. paths are defined to give values (possibly $\bot$) to all variables. I.e. paths are 
total functions; 

2. paths give the value $\bot$ to variables which they have not assigned a value to, 
which allows us to put paths in parallel via meet of paths. 

If a process p can engage in a path $f$ and it is compatible with a path $g$ that 
process q can engage in, in the sense that there is a common refinement $h \sqsupseteq f, g$, 
then $h$ will be a possible path for p in parallel with q. 

A process p has the semantics of the code it contains, after replacing references 
to the local variables v with references to unique globals p.v. 

Sequences of code have the semantics of the set of all possible valid juxtapos- 
itions of paths taken from each set in sequence. 

One of the consequences of this semantics is that all the possible paths in a 
circuit must terminate if all the possible paths in any circuit component (i.e., 
process) terminate. The circuit dies if any of its components dies. 

Another consequence is that if a process can loop infinitely often without 
making infinite time progress, then the path that results is undefined after the 
limit time that the computation approaches. 



7 Approximation 

At this point we will construct a second semantics for the language, using a time 
grid with intervals of no greater size than a fixed constant $\delta$ 

\[
\mathit{Time}_\delta = \{ \dots, t_{-1}, t_0, t_1, \dots \}
\]

with $t_i < t_{i+1} \le t_i + \delta$, and $t_i \to \pm\infty$ as $i \to \pm\infty$. 

The “paths” $f$ that we will now consider will still be functions through the 
continuous time space but they will be completely determined by the points on 
$\mathit{Time}_\delta$; between any two such points they will constantly take the right value of 
the left-hand point, which will be the left value of the right-hand point: 

\[
\mathit{Path}_\delta = \{ f \in \mathit{Path} \mid \forall t, i.\ t \in (t_i, t_{i+1}) \to f^+ t = f^- t = f^+ t_i = f^- t_{i+1} \}
\]

So these paths are square-shaped curves. They inherit the refinement ordering 
on Path. 

The only points at which discontinuities may be defined are the gridpoints: 

\[
\mathit{Discontinuity}_\delta = (\mathit{Time}_\delta, \mathit{State}, \mathit{State})
\]

and in general the semantics of statements will be a set of $\delta$-grid paths: 

\[
\mathit{Semantics}_\delta \subset \{ \mathit{Path}_\delta \}
\]

Likewise we use a version of the oracle predicate that delivers a result that is 
based on the time grid. It gives an exit time exactly on the last grid point 
immediately before the condition $b$ is first satisfied on the path $f$. 

\[
O_\delta(t_{i_0}, t_{i_1}, \sigma_0, f, b, v, e) \iff
f(t_{i_0}) = \sigma_0,\quad
t_{i_1} = \sup \Bigl\{ t_i \;\Big|\; i > i_0,\ \forall j \in [i_0, i).\
\frac{f\,t_{j+1}\,v - f\,t_j\,v}{t_{j+1} - t_j} = f\,t_j\,e,\ \ f\,t_j\,b > 0 \Bigr\}
\]

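Then the semantic function may be defined with only minor alterations with 
respect to that given earlier: 

\[
\begin{array}{rcl}
\llbracket p_1 \dots p_n \rrbracket_\delta &=& \llbracket p_1 \rrbracket_\delta \sqcup \dots \sqcup \llbracket p_n \rrbracket_\delta \\[4pt]
\llbracket p\,(\dots)\{\,\mathtt{var}\ \dots;\ s\,\}; \rrbracket_\delta &=& \llbracket s(p.v_1/v_1, \dots, p.v_n/v_n) \rrbracket_\delta \\[4pt]
\llbracket s_1 \dots s_n \rrbracket_\delta &=& \{ f_1 ; \dots ; f_n \mid f_1 \in \llbracket s_1 \rrbracket_\delta, \dots, f_n \in \llbracket s_n \rrbracket_\delta \} \\[4pt]
\llbracket v = e; \rrbracket_\delta &=& \{ (t_i, \sigma, \sigma') \mid \sigma' = \sigma \oplus \{(v, \sigma\, e)\} \}^\uparrow \\[4pt]
\llbracket \mathtt{wait\ until}\ b\ \mathtt{with}\ dv/dt = e; \rrbracket_\delta &=& \{ f_{t_i,t_j} \mid O_\delta(t_i, t_j, f\,t_i, f, b, v, e) \}^\uparrow \\[4pt]
\llbracket \mathtt{if}\ b\ \mathtt{then}\ s_1\ \mathtt{else}\ s_2\ \mathtt{fi}; \rrbracket_\delta &=& \{ f_{t_i,t_j} \in \llbracket s_1 \rrbracket_\delta \mid f\,t_i\,b \le 0 \}^\uparrow \\
&& \cup\ \{ g_{t_i,t_j} \in \llbracket s_2 \rrbracket_\delta \mid g\,t_i\,b > 0 \}^\uparrow \\[4pt]
\llbracket \mathtt{while}\ b\ \mathtt{do}\ s\ \mathtt{od}; \rrbracket_\delta &=& \mu_\delta F \\
&& \text{where } F \pi = \{ (t_i, \sigma, \sigma) \mid \sigma\, b > 0 \}^\uparrow \\
&& \cup\ \{ f_{t_i,t_j} ; g_{t_j,t_k} \mid f\,t_i\,b \le 0,\ f_{t_i,t_j} \in \llbracket s \rrbracket_\delta,\ g_{t_j,t_k} \in \pi \}^\uparrow
\end{array}
\]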

There are several questions of interest here. Is every path $f$ in the standard 
semantics of a statement the classical limit of paths in the semantics derived 
from sufficiently small $\delta$-grids? Can one choose arbitrary $\delta$-grids? If a process is 
deterministic in the sense that its standard semantics yields a principal set $\{f\}^\uparrow$ 
once all inputs have been defined, then is it the case that $f$ is the unique classical 
limit of paths $f^\delta$? The first of these questions is easy to decide (see the proposition 
that follows), but first a counter-example: if a path varies gently but exits from 
the gentle variation at a given threshold and jumps suddenly to a new value, such 
as for example, in 

\[
x = 0,\ \ t < 0; \qquad
\frac{dx}{dt} = \begin{cases} 1, & 0 < t < 1 \\ 0, & t > 1 \end{cases}; \qquad
x = 0,\ \ t > 1
\]

(a single sawtooth), then on a grid in which the last gridpoint before $t = 1$ is 
$t_i < 1 < t_{i+1}$, the approximation inevitably must make the jump to zero at $t_i$ 
and not at $t = 1$, and hence will introduce an error of size approximately 1 throughout 
the small interval $(t_i, 1)$. This means that convergence cannot be uniform, but it 
will be pointwise. Moreover, convergence will be uniform “almost everywhere” 
in the sense that it will be uniform excluding only neighbourhoods of a set of 
nowhere dense points. 

A suitable program to generate the sawtooth path above is 

    wait until t >= 1 with dx/dt = 1; 
    x = 0; 
    wait until false with dx/dt = 0; 

(for example). 
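
The failure of uniform convergence for the sawtooth can be seen in a small numerical sketch. The Python fragment below is an illustration under stated assumptions: it uses a uniform grid of spacing delta starting at 0, treats the ramp exactly at the gridpoints, and makes the jump to zero at the last gridpoint strictly before t = 1, as the grid semantics requires. The names exact and grid_path are hypothetical.

```python
import math

def exact(t):
    """The exact sawtooth: ramp of slope 1 on [0, 1), zero elsewhere."""
    return t if 0.0 <= t < 1.0 else 0.0

def grid_path(delta, t_end=2.0):
    """Build a square-shaped grid approximation of the sawtooth.

    Returns (exit_time, approx), where exit_time is the last gridpoint
    strictly before t = 1 and approx(t) is the right value of the
    gridpoint at or immediately to the left of t.
    """
    steps = int(math.ceil(t_end / delta))
    grid = [i * delta for i in range(steps + 1)]
    exit_i = max(i for i, t in enumerate(grid) if t < 1.0)
    # right values: the ramp before the exit gridpoint, zero from it onwards
    right = [t if i < exit_i else 0.0 for i, t in enumerate(grid)]

    def approx(t):
        return right[min(int(t // delta), steps)]

    return grid[exit_i], approx
```

With delta = 0.375 the jump is made at 0.75, so at t = 0.9 the approximation is 0 while the exact path is 0.9. With delta = 1/64 the value at the same fixed t = 0.9 is within one grid step of the exact value, yet points just before t = 1 still see an error close to 1.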



Proposition 9. Every proper finite initial segment of a path $f \in \llbracket s \rrbracket$ is the 
(pointwise) limit of paths $f^\delta \in \llbracket s \rrbracket_\delta$ as $\delta \to 0$ for a suitable choice of $\delta$-grids. 

Proof: WLOG, assume that $f$ has infinite time support. Given $f$, choose grids 
which include at least all the time points $t_{i,j}$ at which the processes $p_j$ that 
generated $f$ entered and left wait statements. Let the grid points of the $\delta$-grid be 
(ordered as) $t_k^\delta$. Since $f$ was generated by processes that made infinite progress 
in time, these points do not cluster anywhere. $f$ is smooth at all points not amongst 
the $t_{i,j}$. 

For any choice of the $f^\delta$ given suitable grids as described, let $t$ be the earliest 
time point at which $f^\delta(t) \not\to f(t)$ as $\delta \to 0$. We can include control path markers 
in the state to ensure that this is also the first point $t$ at which the control path 
followed by the $f^\delta$ deviates from the control path followed by $f$ in any of the $p_j$. 

Suppose first that $t$ falls at a point when $f$ is strictly within all wait statements 
of nonzero duration (i.e., $t$ is strictly between the $t_k^\delta$ that are $t_{i,j}$s). Say 
$t_{i_j,j} < t < t_{i_j+1,j}$. Then by hypothesis the states $s^\delta = f^\delta\, t_{i_j,j}$ on entry to the 
waits at $t_{i_j,j}$ (being earlier) converge to the states $s = f\, t_{i_j,j}$ as $\delta \to 0$. Also by 
hypothesis the path fragments of $f$ between entry to the waits and $t$ are the pointwise 
limits of the corresponding $f^\delta$ fragments. Since $f$ is smooth at $t$, the convergence 
is to a continuous (indeed, smooth) function on the closed interval, and hence is 
uniform. 

The approximating oracle function determines the approximating exit time $\tau'$ 
from each wait statement in each process $p_j$ continuously in terms of the state 
at entry to the wait at $\tau$ and the path segment of $f^\delta$ followed by the imported 
variables $v_j$ during the wait. As the state at $\tau$ tends to the state at $t_{i_j,j}$ in $p_j$ and 
the path fragments of $f^\delta$ converge uniformly to the fragments of $f$, so the exit 
time $\tau'$ tends to $t_{i_j+1,j}$ in each process $p_j$. In particular it tends to some value 
greater than $t + \epsilon$, for suitable choice of $\epsilon > 0$. Since the development of the paths 
is governed by differential equations on the intervals $[t_{i_j,j}, t + \epsilon]$, the state at $t$ in 
$f^\delta$ is the limit of the states to the left on the path. These tend to the states on $f$, 
and therefore $f^\delta\, t$ tends to $f\, t$. 

Now consider the case when $t$ falls at the exit point $t_{i_j+1,j}$ of a wait statement 
in process $p_j$. Then the processes $p_j$ that generated $f$ may have engaged in 
computation to determine the state at that point. They could not have entered 
an infinite loop or they would not have progressed beyond $t$, and by hypothesis, 
they have. So the state of $f$ at $t$ is the result of finitely many computations in 
zero time since exiting a wait statement in at least one contributing process. The 
argument of the last paragraph may be used to conclude that the state at the 
exit from the wait statement (if any) depended continuously on the path up till 
then (if there is no preceding wait, then we are at the beginning of the program 
and the state is fixed). Now, a finite number of computations can only examine 
the state variables at exit from the wait to a certain fixed accuracy, and the 
computation thereafter will be invariant once the state is determined to within 
that accuracy. Therefore, taking $\delta$ sufficiently small that the variables on exit 
from the wait are within the accuracy examined, the state at $t$ will be determined 
as closely as may be required. 

The result in the case when $f$ does not have infinite time support comes by 
choosing an arbitrary proper initial support $[0, t]$ (on which $f$ is generated by at 
most a finite number of computations) and choosing the grid as before on this 
interval and extending it arbitrarily to the right and left. 

□ 

On any finite closed set of intervals that does not include the points $t_{i,j}$ at which 
the exact path enters and exits wait statements, the convergence, being pointwise 
and to a piecewise continuous limit, is uniform. 

This result is adequate, but it does depend on the good behaviour of the 
differential equations involved. Can the $\delta$-grids be chosen arbitrarily? If the grid 
is constructed not with respect to $f$, then it can only be guaranteed to have 
points that fall close to but not on the entries and exits from wait statements of 
the processes that construct $f$. 

In these circumstances, the semantics of the oracle predicate $O_\delta$ says that 
wait statements for the approximations will exit at the grid point immediately 
before the exit condition becomes true. For small enough $\delta$ this means only a small 
disturbance with respect to $f$. We can argue as in the proof of the proposition 
above that the processes that generated $f$ are computational and therefore cannot 
use infinite precision, and as soon as the disturbances generated are below this 
level of precision, then the control flow that generates $f^\delta$ will be the same as the 
control flow that generates $f$ over any proper finite initial segment of $f$. So, yes. 

Proposition 10. Every proper finite initial segment of a path f G [s] is the 
(uniform) limit of paths f^ G [s]^ as S ^ 0 for arbitrary choice of small enough 
S -grids. 

It cannot in general be known in advance how small is “small enough” for d. That 
means that we cannot know beforehand what degree of numerical precision is 
required to compute a path accurately over a given time interval. The calculation 
may have to be done again with greater precision. It is not even clear how to 
check that the computed path is accurate. 

The arguments in the preceding propositions showed that control flow is 
determined in a fine enough δ-grid. In conjunction with the determinacy of the 
exit state and time in a wait statement, this shows that the limits referred to are 
unique. 

Proposition 11. If there is a principal path {f} = [s] then the approximations 
f^δ ∈ [s]^δ converge to it uniquely (and uniformly on any finite proper initial 
segment of its support). 

The existence of a limit for the approximating paths in [s]^δ seems to imply that 
the limit is in [s]. 

It also seems to be the case that the language semantics is deterministically 
functional, despite being couched nondeterministically. In a future work, it should 
be possible to demonstrate that the parallelism in this language is benign. Only 
one process writes to each shared variable. This does not on its own mean that 
the value is determined, because data dependency loops a = b; b = a may exist. 
However, each approximating semantics for the language inserts an implicit delay 
in the evaluation of external variables during the solution of differential equations, 
because they use a variant of the formula 

    f(t + δ) = f(t) + (df/dt)(t) · δ 

in which only old values are used to calculate the new. So the approximating 
semantics must be deterministic. 
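
The effect of the implicit delay can be seen in a minimal sketch. It is written in Python rather than in the language of this paper, and the names `euler_step`, `derivs_ab` and `derivs_ba` are illustrative assumptions. Every process reads only a frozen copy of the values at the previous grid point, so the outcome of a step cannot depend on the order in which the processes are evaluated, even when the variables form a dependency loop.

```python
def euler_step(state, derivs, delta):
    """Advance every variable by one grid step using only old values.

    state  : dict mapping variable name -> value at time t
    derivs : dict mapping variable name -> function(old_state) giving df/dt
    delta  : grid spacing
    """
    old = dict(state)  # frozen snapshot: new values never feed back within a step
    return {name: old[name] + f(old) * delta for name, f in derivs.items()}


# Two processes coupled in a loop: da/dt = b and db/dt = -a.
derivs_ab = {"a": lambda s: s["b"], "b": lambda s: -s["a"]}
# The same processes, evaluated in the opposite order.
derivs_ba = {"b": lambda s: -s["a"], "a": lambda s: s["b"]}

start = {"a": 1.0, "b": 0.0}
step_ab = euler_step(start, derivs_ab, 0.1)
step_ba = euler_step(start, derivs_ba, 0.1)
```

Both evaluation orders give the same new state, which is the sense in which the approximating semantics is deterministic.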

8 Experiments 

An interpreter has been built for this language in the functional language Gofer 
[2]. The interpreter works with a fixed δ-grid. 

In experiments we have found that it is not very important to have a small δ 
so long as the solutions to the differential equation sets are relatively smooth (in 
the sense of only slowly changing, i.e. having a small second derivative). Once at 
least quadratic approximation is used in the numerical approximation routines 
that trace the state space trajectories, then the deviation “in flight” from the 
correct trajectory is very small. Of great importance, however, is the accurate 
calculation of the entry and exit times to the wait statements. A small inaccuracy 
can have very large effects, and it has been found necessary to interpolate one 
extra grid point dynamically at the exit of each wait, using a combination of 
Newton-Raphson prediction and interval-halving to determine the time of exit 
accurately. 

For example, consider an analogue circuit that simulates the flight s(t) of a 
bouncing ball. It may be written as the following program: 

    ball (out: s=1, v=0) { 
        var g = 9.8, a = 0.1; 
        while true do 
            wait until s<0 with dv/dt = -g - sign(v) * a * v * v, 
                                ds/dt = v; 
            v = -v; 
            s = 0; 
        od; 
    } 

This program essentially introduces the flight equation under gravity 

    d²s/dt² = −g 

with a modification ±av² that allows for air resistance. The air resistance acts 
against the direction of motion and is proportional to the square of the velocity. 
Whenever the ball bounces on the ground, the velocity reverses exactly. The 
bounce is perfectly elastic. 

The in-flight trajectory deviates slightly but smoothly from a perfect parabola. 
One may calculate that the ball should lose a fraction ah of its energy hg 
to air resistance on each journey to ground from height h, and approximately 
the same loss should occur again on the way up from ground to the peak of the 
next bounce. 
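
The fraction ah can be recovered by a short first-order estimate. The sketch below works per unit mass and assumes that the drag is small enough for the speed at height s to be approximated by its drag-free value; the symbols E_0 and ΔE are introduced here for illustration only.

```latex
\begin{align*}
E_0 &= g h, \qquad v^2(s) \approx 2 g\,(h - s)
      && \text{(drag-free speed at height } s\text{)} \\
\Delta E &\approx \int_0^h a\, v^2(s)\, ds
          = \int_0^h 2 a g\,(h - s)\, ds
          = a g h^2
      && \text{(work done against drag)} \\
\frac{\Delta E}{E_0} &\approx \frac{a g h^2}{g h} = a h
\end{align*}
```

With a = 0.1 and h = 1 this estimate gives a loss of about 10% of the energy on the way down.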

Inaccuracies in the calculation of the trajectory are cumulative. For example, 
using a tangential linear approximation at each step of the numerical integration 
between grid points on the way up in the bounce will put the ball significantly 
higher than it should be (the curve is convex) at each step but get the velocity 
approximately right, with the net result that the ball loses less energy than it 
should on the way up. The total error is proportional to the grid size. A grid of 
0.08 may produce about a 30% overbounce. But almost all this numerical error 
is mended by using quadratic approximation instead of linear approximation at 
each step. 

Of more significance is the error introduced by being late or early in detecting 
the exact point at which the ball hits the floor. The ball is travelling fast there 
so a small time error causes a large velocity error. The program assumes that 
the timing of the exit from the wait statement is exactly right and just reflects 
the velocity (“v = -v;”), while moving the ball through space (“s = 0;”). But, 
given that the exit is taken too early and that the ball is still above the floor then, 
this lowers the potential energy of the ball without increasing the kinetic energy 
to compensate. Because the velocity is high, the ball may be far from the floor 
when the magic move occurs, which means a large change in the energy. The net 
result is a large underbounce from being early on the exit from the wait (and a 
large overbounce from being late). 

To reduce the timing error, we have successfully used Newton-Raphson style 
iteration to locate the onset of the condition s < 0. We use the nonstandard 
interpretation of [s < 0] from Section 5 to get a real-valued “degree of falsity” 
for the condition, and follow the curve of falsity against time to its nearest next 
zero. The curve is piecewise smooth so it is amenable to this method. Once 
located, the time point is added to the current grid. This seems to result in very 
satisfactory precision with very little extra calculation. The state space trajectory 
is shown in Fig. 4. 
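
The exit-time location described above can be sketched as follows. This is a Python illustration under stated assumptions, not the Gofer interpreter: the height s itself is used as the “degree of falsity”, each grid step uses a second-order Taylor step for the position, and the names `accel`, `advance`, `exit_time` and `simulate` are hypothetical.

```python
import math

G, A = 9.8, 0.1  # gravity and drag coefficient, as in the program `ball`


def accel(v):
    """dv/dt = -g - sign(v) * a * v^2 (the drag opposes the motion)."""
    return -G - math.copysign(1.0, v) * A * v * v


def advance(s, v, tau):
    """One step of length tau: quadratic in position, linear in velocity."""
    acc = accel(v)
    return s + v * tau + 0.5 * acc * tau * tau, v + acc * tau


def exit_time(s, v, delta, tol=1e-12):
    """Find tau in [0, delta] where the local quadratic model of s crosses zero.

    A Newton-Raphson step is tried first; whenever it would leave the
    bracket [lo, hi] it is replaced by interval halving.
    """
    lo, hi = 0.0, delta  # height is >= 0 at lo and < 0 at hi
    tau = delta
    for _ in range(100):
        height, speed = advance(s, v, tau)  # speed is d(height)/d(tau)
        if abs(height) < tol:
            break
        if height > 0:
            lo = tau
        else:
            hi = tau
        newton = tau - height / speed if speed != 0 else lo - 1.0
        tau = newton if lo < newton < hi else 0.5 * (lo + hi)
    return tau


def simulate(delta, t_end, locate_exit=True):
    """Return the heights recorded at the top of each bounce."""
    s, v = 1.0, 0.0
    peaks = []
    for _ in range(int(round(t_end / delta))):
        s_new, v_new = advance(s, v, delta)
        if s_new < 0:  # the wait condition s<0 became true inside this step
            tau = exit_time(s, v, delta) if locate_exit else delta
            _, v_hit = advance(s, v, tau)
            s, v = 0.0, -v_hit  # v = -v; s = 0;
        else:
            if v > 0 >= v_new:  # passing the top of a bounce
                peaks.append(s_new)
            s, v = s_new, v_new
    return peaks


peaks = simulate(0.01, 3.0)
```

Calling `simulate(0.08, 3.0, locate_exit=False)` exits each wait at the grid point after the condition becomes true, which gives a way to observe the overbounce from a late exit on a coarse grid.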





Fig. 4. The velocity v and height above ground s against time of a ball bouncing 
elastically under gravity with air resistance, as described by the analogue 
process ball. Initial height s = 1.0. 



9 Conclusion 

A minimal analogue process description language has been designed. It is a hybrid 
programming language that is capable of describing all analogue circuits that 
are defined in terms of state-switched differential equations. The language has 
been given a well defined classical semantics in which every process corresponds 
to a set of possible piecewise smooth paths through state space. Approximate 
semantics have been defined, each based on a particular discrete grid of time 
points. Each approximate semantics is computable exactly on a standard computer. 
It has been shown that, under reasonable assumptions on the behaviour of 
the differential equations involved, the approximate trajectories computed for the 
system converge uniformly to the ideal trajectories on any proper initial segment. 

The discussion has been motivated by the desire to understand first if and then 
why the description of the classical operational amplifier as a simple computational 
input-output causal reactive system is impossible. 



Abstract. In order to realise digital systems that operate at high 
speeds or that have very low power consumptions, it is necessary 
to work directly at the analog level of abstraction, that is, in terms 
of analog electronic components such as resistors and transistors. 
Although the external behaviour of such circuits can be described 
digitally, their internal operation can only be explained by working 
at the analog level and by taking account of both voltages and 
currents. 

This chapter describes how existing methods of specification and 
formal verification of digital systems can be extended so as to 
encompass such analog designs in a fully rigorous manner. 



1 Introduction 

The usual, idealised approach to digital design is summarised in Fig. 1: 

— The starting point is a requirements specification. This can be a predicate, 
req, describing the desired relation between the signals at the 
terminals of the desired implementation. 

— The process of synthesis (typically informal) leads to an implementation, 
often presented in the form of a gate schematic. 

— The process of behavioural extraction yields a derived specification, imp, 
of the behaviour of the implementation. 

— The process of verification compares the predicates req and imp to 
determine if the implementation is correct. 

As a very simple example, consider the design of an Exclusive-Or gate. The 
predicate req will be 

    req : 𝔹² × 𝔹 → prop 
    req ((x, y), z) = (z = (x ⊕ y)) 

(where 𝔹 ≜ {t, f} is the type of ideal, steady-state digital signals and prop 
is the type of propositional logic truth values). 

— Synthesis might yield the gate schematic shown in Fig. 2. 

Fig. 1. Idealised digital design process 



— Behavioural extraction would then yield the derived specification 

    imp : 𝔹² × 𝔹 → prop 
    imp ((x, y), z) = 
        ∃u, v, w : 𝔹. 
            or ((x, y), u) ∧ and ((x, y), v) ∧ 
            not (v, w) ∧ and ((u, w), z) 

— The verification condition is 

    ∀x, y, z : 𝔹. imp ((x, y), z) ⇒ req ((x, y), z) 

or, in higher-order form, imp ⊑ req. 
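
Because the digital signals range over only two values, this verification condition can be checked by exhaustive enumeration. The following Python sketch does so; the function names (`or_gate`, `and_gate`, `not_gate`, `refines`) and the use of Python booleans for t and f are assumptions made for illustration.

```python
from itertools import product

B = (True, False)  # the ideal digital signal levels t and f


def req(xy, z):
    """Requirements specification: z is the exclusive-or of x and y."""
    x, y = xy
    return z == (x != y)


def or_gate(xy, z):
    return z == (xy[0] or xy[1])


def and_gate(xy, z):
    return z == (xy[0] and xy[1])


def not_gate(x, z):
    return z == (not x)


def imp(xy, z):
    """Derived specification: internal nodes u, v, w are existentially quantified."""
    return any(
        or_gate(xy, u) and and_gate(xy, v) and not_gate(v, w) and and_gate((u, w), z)
        for u, v, w in product(B, repeat=3)
    )


def refines(p, q):
    """p is at least as strong as q: q holds wherever p holds."""
    return all((not p((x, y), z)) or q((x, y), z) for x, y, z in product(B, repeat=3))


xor_correct = refines(imp, req)
```

For this particular schematic the converse, `refines(req, imp)`, also holds, so the derived and required specifications coincide.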




Fig. 2. A typical implementation of Exclusive-Or 

It is the task of the fabricator of the devices (in general, a different 
person than the user of the devices) to ensure that the devices function in 
such a way as to guarantee that when the analog signals at their terminals 
are viewed through the appropriate abstraction function, they do, in fact, 
satisfy the predicate imp. Doing this involves ensuring that the functions 
and predicates round the square in this diagram commute (that is, that 
IMP' ⊑ IMP). In general, this is guaranteed only provided the digital 
circuit obeys certain (technology-specific) design rules. 

Limitations There are two major limitations with this approach to specification 
and verification. Firstly, compositionality has not been considered, 
and this leads to obvious anomalies. For example, the back-to-front gate 
schematic shown in Fig. 3 equally satisfies the above verification condition. 
Secondly, and of greater significance, this approach does not cover optimised 
implementations. In practice, designers usually need to optimise an implementation 
with respect to speed, power consumption or area. For instance, 
an optimised implementation of req might instead use the circuit shown in 
Fig. 4. Such an implementation, which makes use of pass-transistor logic, 
is very fast, consumes very little power and occupies a very small area. 

Fig. 3. An unsatisfactory implementation of Exclusive-Or 

Reasoning about the behaviour of implementations involving pass-transistor 
logic, or even those involving tristate logic (as widely used in implementing 
bidirectional busses) is much more difficult than reasoning about the 
properties of pure logic circuitry for three reasons: 

— The property of directionality is not present. Causality in such implementations 
is adirectional and there is no notion of input and output. 

— To understand the operation of the circuit, both voltages and currents 
need to be considered. 

— There is no a priori reason why any of the signals (including the ‘output’ 
signal) should be at well-defined digital levels. 

Informally, the behaviour of implementations involving tristate gates or 
wired-or gates is determined by “summing the drives” and then using a 
“resolution function” to convert the summed drive to a logic level. For 
example, the combination of drives of a weak pullup in parallel with a strong 
pulldown and three high impedances would resolve to a low logic level. 











K. Hanna 




Fig. 4. An optimised implementation of Exclusive-Or, making use of 
pass-transistor logic. 



Conclusion The design process for analog-level implementations of this 
general kind is more accurately described by Fig. 5 (or by an amalgam 
of both diagrams if the implementation involves both analog and digital 
components): 

— As before, the starting point is a requirements specification, req, 
described at the digital level of abstraction. 

— Synthesis yields an analog circuit (maybe with tristate gates, pullup 
resistors and pass transistors present). 

— Behavioural extraction yields a derived specification, IMP, involving 
both voltages and currents, and described at the analog level of 
abstraction. 

— Behavioural abstraction maps the digital-level requirements specification, 
req, to a corresponding specification, REQ, described at the analog 
level. 

— The verification condition is now IMP ⊑ REQ. 

In this paper, we show how the well-established techniques of specification 
and verification at the digital level can be extended so as to encompass the 
analog level as well, and how behaviours at the digital and analog levels 
can be related. The approach inherits and builds upon concepts introduced 
in [1, 2, 4, 6, 8, 9, 10] but goes significantly further in generality and rigour. 

Fig. 5. Design process for analog implementations 



2 Analog specifications 

We take a behavioural specification for an analog component as being a 
predicate on the signals, both voltage and current, at its terminals. Such 
a specification can be either static or dynamic according to whether 
the signals are steady-state levels or time-varying waveforms. In general, 
a relational specification does not uniquely describe a device’s observable 
behaviour; rather, it defines a subset of possible behaviours. 

Notation We treat analog physical quantities as reals (ℝ). For clarity, we 
introduce type synonyms of V (for voltages) and I (for currents), both 
equivalent to ℝ. 

2.1 Examples (static specifications) 

We illustrate these concepts with a series of examples. 

The specification, voltage, for an ideal voltage source is defined as 

    voltage : V → (V × I) → prop 
    voltage v₁ (v, i) = (v = v₁) 

The specification of the dual concept, an ideal current source, is 

    current : I → (V × I) → prop 
    current i₁ (v, i) = (i = i₁) 

The specification, short, for a short circuit is short = voltage 0, and the 
specification for the dual concept, an open circuit, is open = current 0. 

The specification, res, for an ideal resistor, is 

    res : ℝ → (V × I) → prop 
    res r (v, i) = (v = i × r) 

The specification, Res, for an actual resistor (which is specified to within 
a given tolerance) is 

    Res : ℝ → ℝ → (V × I) → prop 
    Res r₁ tol (v, i) = ∃r : ℝ. (|r − r₁| < tol × r₁) ∧ (res r (v, i)) 

The specification, diode, for an ideal junction diode, is given by the 
well-known ideal diode equation, cast in relational form 

    diode : V → I → (V × I) → prop 
    diode v_th i_sat (v, i) = (i = i_sat × (exp(v/v_th) − 1)) 

A looser version of the same specification for describing the behaviour of 
an actual junction diode is 

    Diode : V² → I² → (V × I) → prop 
    Diode (v_th1, v_th2) (i_sat1, i_sat2) (v, i) = 
        ∃v_th, i_sat. 
            (v_th1 ≤ v_th ≤ v_th2) ∧ (i_sat1 ≤ i_sat ≤ i_sat2) ∧ 
            diode v_th i_sat (v, i) 

Here, the arguments v_th1, v_th2 and i_sat1, i_sat2 define the lower and upper 
bounds for the parameters v_th and i_sat of an ideal diode. 
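
These static specifications translate directly into predicates on a (voltage, current) pair. The Python sketch below is one possible rendering, under some assumptions of ours: equalities on reals are tested with `math.isclose`, and the existential quantifier in Diode is decided by observing that, for positive parameters, the ideal diode current is monotonic in each parameter, so the attainable currents form the interval spanned by the four corner cases. The parameter values in the example are illustrative only.

```python
import math


def voltage(v1):
    """Ideal voltage source: the terminal voltage equals v1."""
    return lambda v, i: v == v1


def current(i1):
    """Ideal current source: the terminal current equals i1."""
    return lambda v, i: i == i1


short = voltage(0.0)
open_circuit = current(0.0)


def res(r):
    """Ideal resistor: v = i * r, compared up to rounding error."""
    return lambda v, i: math.isclose(v, i * r, rel_tol=1e-9, abs_tol=1e-12)


def diode(v_th, i_sat):
    """Ideal junction diode: i = i_sat * (exp(v / v_th) - 1)."""
    def spec(v, i):
        ideal = i_sat * (math.exp(v / v_th) - 1.0)
        return math.isclose(i, ideal, rel_tol=1e-9, abs_tol=1e-15)
    return spec


def Diode(v_th_bounds, i_sat_bounds):
    """Actual diode: some (v_th, i_sat) within the bounds satisfies `diode`."""
    def spec(v, i):
        corners = [i_sat * (math.exp(v / v_th) - 1.0)
                   for v_th in v_th_bounds for i_sat in i_sat_bounds]
        return min(corners) <= i <= max(corners)
    return spec


# Hypothetical parameter values, chosen only to exercise the predicates.
loose = Diode((0.02, 0.03), (1e-12, 1e-11))
i_mid = 5e-12 * (math.exp(0.6 / 0.025) - 1.0)
```

Here `loose(0.6, i_mid)` holds because the ideal diode with v_th = 0.025 and i_sat = 5e-12 lies inside the stated bounds.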

2.2 Examples (dynamic specifications) 

The same approach extends naturally to cover the specification of dynamic 
behaviours. To express time, we introduce a type synonym, T (for time), 
equivalent to ℝ, and we treat waveforms (that is, time-varying signals) as 
functions defined on T. 

Any static specification may be lifted to a dynamic one simply by asserting 
that the static specification holds at all instants of time. For instance, 
the ideal diode specification defined above can be lifted to a dynamic 
specification, diode', defined by 

    diode' : V → I → (T → V) × (T → I) → prop 
    diode' v_th i_sat (u, j) = ∀t : T. diode v_th i_sat (u t, j t) 

Many devices have behavioural specifications that are inherently dynamic, 
typically taking the form of a differential equation. To describe such 
equations, we use the differentiation function, D : (ℝ → ℝ) → (ℝ → ℝ), that 
takes a differentiable function to its derivative. For instance, the value of 
D sin is cos. 

An example of a device with an inherently dynamic specification is an 
ideal capacitor. Its specification, cap, is parameterised by its capacitance: 

    cap : ℝ → (T → V) × (T → I) → prop 
    cap c (u, j) = ∀t : T. j t = c × (D u) t 

(that is, the instantaneous current is equal to the capacitance times the 
rate of change of voltage). 



2.3 Operations on specifications 

Many useful operations can be defined on specifications. Some apply to any 
kind of specification, others are specific to specifications involving voltages 
and currents. 

Lattice operations We overload the usual propositional connectives (∧, ∨, 
⇒, etc.) and the two propositional truth values (T and F) so that they also 
apply to predicates (i.e., specifications) of any arity. Thus, we have, for any 
type S and for any predicates p and q defined on S, the following operations: 

    (p ∧ q) x = p x ∧ q x 
    (p ∨ q) x = p x ∨ q x 
    (p ⇒ q) x = p x ⇒ q x 
    T x = T 
    F x = F 

We also define a partial ordering on predicates by 

    (⊑) : (S → prop) × (S → prop) → prop 
    p ⊑ q = ∀x : S. p x ⇒ q x 

(Read p ⊑ q as ‘p is at least as strong a specification as q’.) 
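
As a rough illustration, the lifted connectives and the ordering can be written as higher-order functions. In this Python sketch the universal quantifier is replaced by a check over a finite sample `domain`, which is our simplifying assumption: such a check can refute p ⊑ q but cannot prove it over the reals. All names are illustrative.

```python
def p_and(p, q):
    return lambda x: p(x) and q(x)


def p_or(p, q):
    return lambda x: p(x) or q(x)


def p_implies(p, q):
    return lambda x: (not p(x)) or q(x)


TOP = lambda x: True       # the predicate T
BOTTOM = lambda x: False   # the predicate F


def refines(p, q, domain):
    """Check p ⊑ q on a finite sample: q holds wherever p holds."""
    return all(q(x) for x in domain if p(x))


domain = range(-5, 6)
positive = lambda x: x > 0
nonneg = lambda x: x >= 0
```

On this sample, `positive` is at least as strong as `nonneg`, but not the other way round; F is stronger than every predicate and T is weaker than every predicate.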

Composition of components Given a set of components, the voltages and 
currents at their terminals will, by definition, obey the individual 
specifications of each one separately and so will obey the conjunction of them 
all taken together. In addition, the overall circuit will impose the usual 
conservation constraints (currents summing to zero, etc). 

The act of hiding a node for an analog circuit involves two conceptually 
distinct operations: 

— Disconnecting it, that is, asserting that there will be a zero net current, 
and 

— Disregarding it, that is, ignoring the voltage present at the node. 

The first is carried out by substitution, the second by existential 
quantification. For example, if spec is the specification of an n + 1 terminal 
component, then the specification, spec', of the component with the last 
terminal hidden is 

    spec' (…, (v_{n−1}, i_{n−1})) = 
        let i_n = 0 in ∃v_n. spec (…, (v_{n−1}, i_{n−1}), (v_n, i_n)) 

Serial and parallel composition Using the above rules, we can define a 
combinator, ++, that takes the specifications, p and q, of a pair of 
2-terminal components and yields the specification, p ++ q, of their serial 
composition: 

    (++) : (V × I → prop)² → (V × I → prop) 
    (p ++ q) (v, i) = ∃v₁, v₂. p (v₁, i) ∧ q (v₂, i) ∧ v₁ + v₂ = v 

Likewise, we can define the dual notion, the combinator ‖ for parallel 
composition: 

    (‖) : (V × I → prop)² → (V × I → prop) 
    (p ‖ q) (v, i) = ∃i₁, i₂. p (v, i₁) ∧ q (v, i₂) ∧ i₁ + i₂ = i 

Identities The device specifications and combinators described above satisfy 
a number of useful identities. For example: 

— The combinator ‖ is commutative and associative, and has open as 
its unit: 

    p ‖ q = q ‖ p,   p ‖ (q ‖ r) = (p ‖ q) ‖ r,   p ‖ open = p. 

— The dual combinator, ++, is likewise commutative and associative, 
and has short as its unit. 

— Voltage and current sources, when appropriately composed, add linearly: 

    voltage v₁ ++ voltage v₂ = voltage (v₁ + v₂), 
    current i₁ ‖ current i₂ = current (i₁ + i₂). 



2.4 Correctness 

The condition that an implementation with a derived specification imp is a 
correct implementation of a given requirements specification, req, is that any 
tuple, x, of signals that satisfy the derived specification should also satisfy 
the requirements specification. That is, the implementation is correct if the 
following verification condition holds: 

    ∀x. imp x ⇒ req x 

that is, if imp ⊑ req. 

As a simple example, the serial composition of an ideal 20 Ω resistor and 
a 27 Ω one is a correct implementation of a requirements specification for a 
50 Ω ± 10% actual resistor, since 

    ∀v, i. (res 20 ++ res 27) (v, i) ⇒ Res 50 0.1 (v, i) 
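
The resistor example can be spot-checked numerically. The Python sketch below makes two simplifications that are ours, not the paper's: the series composition of two ideal resistors is written in its closed form v = i × (ra + rb) rather than through a general ++ combinator, and the universal quantifier is sampled at a handful of currents. The argument order Res r₁ tol follows the definition of Res given earlier.

```python
import math


def res(r):
    """Ideal resistor: v = i * r, compared up to rounding error."""
    return lambda v, i: math.isclose(v, i * r, rel_tol=1e-9, abs_tol=1e-12)


def Res(r1, tol):
    """Actual resistor: some r with |r - r1| < tol * r1 satisfies res r."""
    def spec(v, i):
        if i == 0:
            return v == 0          # any admissible r is a witness
        r = v / i                  # the only possible witness
        return abs(r - r1) < tol * r1
    return spec


def series_res(ra, rb):
    """Closed form of (res ra ++ res rb): witnesses v1 = i*ra and v2 = i*rb."""
    return res(ra + rb)


imp = series_res(20.0, 27.0)
req = Res(50.0, 0.1)

sample_currents = (-2.0, -0.1, 0.0, 0.3, 1.0)
correct_on_samples = all(
    req(i * 47.0, i) for i in sample_currents if imp(i * 47.0, i)
)
```

The check succeeds because the combined resistance of 47 Ω lies within 5 Ω of 50 Ω; two 20 Ω resistors in series (40 Ω) would fail the same requirement.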
3 Rectilinear specifications 

In general, a behavioural specification of an analog component is an entirely 
arbitrary predicate on the voltages and currents at its ports. This means 
both that it can be difficult to validate (i.e., confirm that it captures the 
intended notion) and that there will not be any decision procedures available 
to aid reasoning about it. 

We have, however, observed that, for analog implementations of digital 
devices, the task of establishing correctness (and other useful properties) 
only relies upon having a relatively coarsely quantified approximation 
to these analog behaviours. We exploit this fact by restricting the class of 
specifications we consider to a syntactically defined subset of all possible 
specifications. As we shall demonstrate, by a judicious choice of this subset, 
we can simplify the task of reasoning about correctness (and related 
properties) to a point where it becomes computationally decidable, without any 
significant loss in the range of implementations that can be proven correct. 

In this section we discuss rectilinear specifications, a restricted form of 
specification that meets this requirement. We term a specification rectilinear 
if: 



— Its atomic formulae are restricted to (in)equalities involving at most 
one parameter; and 

— No bound variable occurs in more than one atomic formula in which a 
parameter is present¹. 

The graph of a rectilinear specification consists of regions bounded by 
hyperplanes which are parallel to the coordinate axes. 

The class of rectilinear specifications is closed under the propositional 
connectives and quantification, and so the derived specification of a circuit 
whose components are specified by rectilinear specifications will itself be 
rectilinear. 



3.1 Examples 

Bipolar junction diode Consider a rectilinear specification for a bipolar 
junction diode (a component widely used in implementations of digital 
devices). From the point of view of the digital designer, its behaviour can be 
specified as the conjunction of four subspecifications. These cover each of 
the following characteristic modes of operation: 

1. When it is reverse biassed (or hard off); 

2. When it is forward biassed, but not yet conducting significantly; 

3. When it is forward biassed and conducting an indeterminate amount; 
and 

4. When it is forward biassed and conducting strongly (or hard on). 

¹ This restriction prevents bound variables being used to implicitly allow more 
than one parameter to figure in an atomic formula as, for instance, in a 
(non-rectilinear) specification like: spec (v, i) = ∃v'. (v' = v) ∧ (v' = i × r). 

This leads (see Fig. 6) to a rectilinear specification of the form: 

    diode_rl v₁ v₂ i₁ i₂ i₃ (v, i) = 
        ((v < 0) ⇒ (i₁ < i < 0)) ∧ 
        ((0 < v < v₁) ⇒ (0 < i < i₂)) ∧ 
        ((v₁ < v < v₂) ⇒ (0 < i)) ∧ 
        ((v₂ < v) ⇒ (i₃ < i)) 

Notice that the overall specification consists of a conjunction of four 
subspecifications, and that each subspecification consists of an implication 
whose hypothesis delimits a region of the parameter space within which 
the consequent constrains the allowable range of behaviours. 

Fig. 6. Graph of a rectilinear specification of a junction diode. 

This specification can be specialised to give the specification obeyed by 
a typical silicon signal junction diode: 

typical-diode = diode 0.3 0.6 — 10“® 10“® 0.01 
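
A rectilinear specification of this shape is straightforward to evaluate, since every atomic formula compares a single parameter with a constant. The Python sketch below is illustrative only: the name `diode_rl` is ours, strict inequalities are used throughout as an assumption, and the two leakage-current bounds passed to `typical_diode` (-1e-6 and 1e-6) are placeholder values rather than figures taken from the text.

```python
def implies(a, b):
    return (not a) or b


def diode_rl(v1, v2, i1, i2, i3):
    """Rectilinear diode: four implications, one per mode of operation."""
    def spec(v, i):
        return (implies(v < 0, i1 < i < 0)            # reverse biassed (hard off)
                and implies(0 < v < v1, 0 < i < i2)   # forward, barely conducting
                and implies(v1 < v < v2, 0 < i)       # forward, indeterminate
                and implies(v2 < v, i3 < i))          # forward, hard on
    return spec


# Thresholds 0.3 V, 0.6 V and 0.01 A; the leakage bounds are placeholders.
typical_diode = diode_rl(0.3, 0.6, -1e-6, 1e-6, 0.01)
```

Each call such as `typical_diode(0.7, 0.02)` tests a single operating point against whichever of the four regions contains it.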

Bipolar junction transistor Digital circuits make use of both bipolar junction 
transistors (BJTs) and field effect transistors (FETs). In general, the 
former are more difficult to reason about since their control electrode 
(generally the base) imposes a significant current loading on the circuit (by 
contrast, the gate electrode of a FET can be considered to be an open 
circuit). The examples given in this chapter all refer to BJTs. 

The description of the behaviour of a BJT [7] is generally partitioned 
into four modes (forward active, saturation, cutoff and reverse active) 
according to the relative polarities of its three electrodes. In TTL technologies 
(described later), all four modes of behaviour can occur. From the perspective 
of an analog designer, its detailed behaviour in each of these modes 
of behaviour is complex. However, just as with the diode, a broad-brush, 
conservative approximation to each mode of behaviour turns out to provide 
sufficient information to characterise its behaviour from a digital perspective. 
For example, when a typical BJT is operating in the cutoff mode (as 
defined by both its junctions being reverse biassed), the current through its 
collector electrode can simply be characterised as being between zero and 
1 nA; no further precision is necessary. 

The specification for each of the four modes of operation of a BJT takes 
the form of an implication, with the hypothesis defining a region in the 
operating space and the conclusion specifying its behaviour in that region. 
The overall specification is then simply the conjunction of these partial 
specifications. 

4 Specification at the digital level 

Our aim in this section is to formulate specifications, at the digital level of 
abstraction, for the steady-state behaviour of ordinary gates. This apparently 
trivial task needs to be handled with care in order to avoid inconsistency 
arising when these specifications are related to the corresponding ones 
at the analog level of abstraction. We illustrate the approach by considering 
the specification for a 2-input And gate. 

Our starting point is the set 𝔹 = {t, f} of ideal digital signal levels to 
which we add a third element, n, to allow non-digital signal levels to be 
represented, giving the set 𝕋 ≜ {t, n, f} of non-ideal signal levels. 

Since the output of an And gate with signals of t and n at its inputs 
could be any of t, n or f, a relational description is required; a functional 
one would involve arbitrary overspecification. The specification, a predicate 
on the tuple of signals at the inputs and output of the gate, is of type 

    andReq : 𝕋² × 𝕋 → prop 

Evidently, it needs to be compatible with the Boolean algebra function 
and : 𝔹² → 𝔹. The weakest relation on 𝕋² × 𝕋 with this property is 

    andReq₁ ((x, y), z) = if (x ≠ n) ∧ (y ≠ n) then z = and (x, y) 

In practice, however, designers invariably assume a stronger specification 
than this; they also assume the characteristic non-strict property that 
f acts as a zero element: 

    andReq₂ ((x, y), z) = (x = f) ∨ (y = f) ⇒ (z = f) 

Taking the conjunction of these two predicates yields the overall specification 
for an And gate: 

    andReq = andReq₁ ∧ andReq₂ 

This can equivalently be expressed as 

    andReq ((x, y), z) = 
        ((x = t) ∧ (y = t) ⇒ (z = t)) ∧ 
        ((x = f) ∨ (y = f) ⇒ (z = f)) 
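
The claimed equivalence of the two formulations can be confirmed by enumerating all 27 triples of signal levels. In this Python sketch the three levels are encoded as the strings "t", "n" and "f", an encoding chosen for illustration, as are the function names.

```python
from itertools import product

LEVELS = ("t", "n", "f")  # true, non-digital, false


def implies(a, b):
    return (not a) or b


def and_bool(x, y):
    """The Boolean function and, defined only on the ideal levels t and f."""
    return "t" if (x == "t" and y == "t") else "f"


def and_req1(x, y, z):
    return implies(x != "n" and y != "n", z == and_bool(x, y))


def and_req2(x, y, z):
    return implies(x == "f" or y == "f", z == "f")


def and_req(x, y, z):
    return (implies(x == "t" and y == "t", z == "t")
            and implies(x == "f" or y == "f", z == "f"))


equivalent = all(
    (and_req1(x, y, z) and and_req2(x, y, z)) == and_req(x, y, z)
    for x, y, z in product(LEVELS, repeat=3)
)
```

The enumeration also makes the relational character visible: with inputs t and n, every output level satisfies `and_req`, whereas with inputs f and n only the output f does.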

Relational specifications for other types of gate can be derived in a 
similar way. For example, the specification for a Nand gate is 

    nandReq : 𝕋² × 𝕋 → prop 
    nandReq ((x, y), z) = 
        ((x = t) ∧ (y = t) ⇒ (z = f)) ∧ 
        ((x = f) ∨ (y = f) ⇒ (z = t)) 

5 Analog level behaviours 

We now describe how the specifications of digital devices can be formulated 
at the analog level (where both voltages and currents are involved), how 
the specifications of implementations can be derived, and how correctness 
conditions are expressed. We illustrate the discussion with examples based 
on TTL (transistor-transistor logic), a technology based on bipolar junction 
transistors (and therefore a challenging one to handle formally). 



5.1 Specification at the analog level 

We begin by describing how device specifications expressed at the digital 
level of abstraction (such as the Nand gate specification just defined) can 
be mapped down to the analog level of abstraction. There are two aspects 
that have to be considered: the abstraction function that defines the relation 
between signals at each of these levels and the loading predicates that define 
the loading a gate is allowed to impose on its environment and the amount 
of drive it can supply to its environment. 

Voltage abstraction function The voltage abstraction function relates analog 
voltage levels to digital signal levels. For a standard TTL technology 
(and assuming the true-as-high convention) voltages below 0.4 V represent 
false and ones above 2.4 V represent true. Thus, the abstraction function for 
TTL is: 

    α : V → 𝕋 
    α v = if v ≤ 0.4 then f else 
          if v ≥ 2.4 then t else 
          n 



Voltage-only device specifications Using the abstraction function, a device 
specification defined at the digital level can be mapped down to a voltage 
specification at the analog level (notice the contravariance: whilst the 
abstraction function maps analog signals up to digital ones, it maps digital 
specifications down to analog ones). 

As an example, the digital-level Nand gate specification, nandReq, 
defined earlier, maps to the analog-level specification, NandReq, defined by 

    NandReq : V² → V → prop 
    NandReq (v₁, v₂) v = nandReq ((α v₁, α v₂), α v) 

This specification, however, does not adequately characterise the behaviour 
we require of a gate since it makes no reference to current flows. For 
instance, it implies that forcing a Nand gate’s output terminal low would 
unconditionally prevent either of its input terminals from going low. Such 
a device would be rather difficult to realise. To obviate such difficulties, it 
is necessary to take account of currents as well as voltages. 
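
The mapping from the digital to the analog level, and the anomaly just described, can both be reproduced in a few lines. The Python sketch below assumes that the thresholds 0.4 V and 2.4 V are inclusive and encodes the signal levels as the strings "t", "n" and "f"; the function names are illustrative.

```python
def alpha(v):
    """TTL voltage abstraction: analog voltage -> signal level."""
    if v <= 0.4:
        return "f"
    if v >= 2.4:
        return "t"
    return "n"


def implies(a, b):
    return (not a) or b


def nand_req(x, y, z):
    """Digital-level relational specification of a Nand gate."""
    return (implies(x == "t" and y == "t", z == "f")
            and implies(x == "f" or y == "f", z == "t"))


def NandReq(v1, v2, v):
    """Voltage-only analog specification, obtained by composing with alpha."""
    return nand_req(alpha(v1), alpha(v2), alpha(v))


# With the output held low (0.2 V), the specification excludes a low input.
output_low_allows_low_input = NandReq(0.1, 3.0, 0.2)
```

The value of `output_low_allows_low_input` is false: read as a relation on voltages alone, the specification forbids a low input whenever the output is forced low, which is the unrealisable behaviour noted above.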

Loading of transistor-transistor logic (TTL) We refer to the relation 
between the voltage and current at a device’s terminals as the load it imposes 
(or, equivalently, as the drive it can supply). There are two aspects of a 
digital device’s behaviour that we need to be able to characterise: 

— The load an input terminal is allowed to impose on the gate’s environment. 
We use a predicate 

    unitLoad : (V × I) → prop 

to characterise this. A definition of this predicate for a standard TTL 
input terminal is shown in Fig. 7(a). 

— The drive an output terminal is required to be able to supply to the 
gate’s environment whilst still maintaining the intended digital signal 
level (that is, whilst still satisfying the gate’s voltage-only specification). 
We use a predicate 

    stdLoading : (V × I) → prop 

to characterise this. A definition of this predicate for a standard TTL 
output terminal is shown in Fig. 7(b). 




Reasoning about Imperfect Digital Systems 



Using these two loading predicates, we can now define behavioural 
specifications at the analog level. For example, the specification for a Nand gate 
(with terminals as shown in Fig. 8) takes the form: 

    NANDreq : (V × I)² → (V × I) → prop 
    NANDreq ((v_in1, i_in1), (v_in2, i_in2)) (v_out, i_out) = 
        unitLoad (v_in1, i_in1) ∧ unitLoad (v_in2, i_in2) ∧ 
        (stdLoading (v_out, i_out) ⇒ NandReq (v_in1, v_in2) v_out) 

This specification states that the load a Nand gate imposes on each of 
its input terminals must satisfy the unitLoad specification and that, 
provided the load the environment imposes on the output terminal satisfies 
the stdLoading specification, the voltages on the gate’s input and output 
terminals will satisfy the voltage-only specification for a Nand gate. 



5.2 Implementation at the analog level 

A typical TTL implementation for a Nand gate is shown in Fig. 9. Assuming 
that Q1 is the specification for transistor Q1, R1 the specification for 
resistor R1, and so on, the derived specification for this circuit² takes the 
form: 

    NANDimp : (V × I)² → (V × I) → prop 
    NANDimp ((v_in1, i_in1), (v_in2, i_in2)) (v_out, i_out) = 
        ∃v₁, v₂, …, v_n : V. 
        ∃i₁, i₂, …, i_m : I. 
            Q1 (v_in1, i_in1, v_in2, i_in2, v₁, i₁, v₂, i₂) ∧ 
            R1 (V_cc − v₁, i₁) ∧ 
            ⋮ 

(That is, the voltages and currents at the terminals of each device obey the 
behavioural specification for the device and, in addition, the currents sum 
to zero at nodes and voltage differences sum to zero round closed paths.) 

Fig. 9. A typical implementation, in TTL technology, of a Nand gate. 

² A complete derived specification for a similar circuit is given in the Appendix. 



5.3 Verification at the analog level 

The verification condition asserting that the behaviour of this circuit sat- 
isfies the behavioural specification for a Nand gate is simply NANDimp □ 
NANDreq, or, in first-order form: 

∀ v_in1, v_in2, v_out : V.
∀ i_in1, i_in2, i_out : I.
    NANDimp ((v_in1, i_in1), (v_in2, i_in2)) (v_out, i_out) ⇒
    NANDreq ((v_in1, i_in1), (v_in2, i_in2)) (v_out, i_out)

There are two approaches that can be used to establish the truth of 
verification conditions of this form. One is to use conventional theorem- 
proving techniques, the other is to use model-checking techniques. 

An account of a theorem-proving approach was presented in [5]. Whilst
this approach is certainly possible, it turns out to be highly labour-intensive. 
Essentially it involves attaching assertions to the output terminals of an 
implementation and then pushing them back through the circuit (using the 
device specifications as predicate transformers) to the input terminals. 

The model-checking approach essentially involves constraint satisfaction 
techniques. In principle, a general purpose constraint satisfaction program 
(such as CLP(R) [3]) could be used. In practice, such an approach turns out 
to be hopelessly inefficient. A specialised decision procedure has therefore 
been tailored to the characteristics of this particular kind of problem. 

6 Decision procedure 

The verification condition we wish to establish is of the form 

∀e. imp[e] ⇒ req[e]



where:

— e is the tuple of external variables (for instance, in the verification
condition for the Nand gate (above), the tuple comprises the variables
v_in1, ..., i_out);

— imp[e] is the derived specification of the implementation; 

— req[e] is the requirements specification. 

In general, a derived specification will be of the form ∃i. implem[e, i], where
i is a tuple of internal variables in the implementation (for instance, the
variables v1, ..., vn and i1, ..., im in the NANDimp specification). Thus,
the verification condition can be cast in the form

∀e. (∃i. implem[e, i]) ⇒ req[e]

By moving the existential quantifier to the outer level (where it becomes 
a universal), and then by replacing the universal quantifier by a negated 
existential formula, the verification condition can be cast in the form 

¬∃e, i. implem[e, i] ∧ ¬req[e]

By using simple propositional reasoning, the body of this formula can then 
be expressed in disjunctive normal form: 

φ1[e, i] ∨ ... ∨ φn[e, i]

where each subformula φj is a conjunction of literals, and each literal is a
simple linear (in)equality in the external and internal variables.

Finally, by distributing the existential quantifier over the disjunction, 
the verification condition is converted to the form 

¬((∃e, i. φ1[e, i]) ∨ ... ∨ (∃e, i. φn[e, i]))

The verification condition can now be established simply by showing that
each of the n formulae φ1[e, i], ..., φn[e, i] has no feasible solution.
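
The propositional part of this transformation can be illustrated with a small sketch. It is not the Haskell program described in this paper; it is a minimal Python illustration in which atoms are opaque strings, and the names nnf, dnf and refutation_disjuncts are hypothetical. In the actual procedure each atom is a linear (in)equality, so a negated literal would be rewritten as the opposite inequality rather than kept as a signed atom.

```python
from itertools import product

def nnf(f, neg=False):
    """Push negations inwards; the result uses only and/or over signed literals."""
    if isinstance(f, str):                      # an atom, e.g. "ia > 1.0E-3"
        return ("lit", f, not neg)
    op = f[0]
    if op == "not":
        return nnf(f[1], not neg)
    if op == "imp":                             # a => b  is  (not a) or b
        return nnf(("or", ("not", f[1]), f[2]), neg)
    if op in ("and", "or"):
        if neg:                                 # De Morgan swaps the connective
            op = "or" if op == "and" else "and"
        return (op,) + tuple(nnf(g, neg) for g in f[1:])
    raise ValueError(f"unknown connective: {op!r}")

def dnf(f):
    """Disjuncts of an NNF formula, each a frozenset of (atom, polarity) literals."""
    if f[0] == "lit":
        return [frozenset([(f[1], f[2])])]
    parts = [dnf(g) for g in f[1:]]
    if f[0] == "or":
        return [d for part in parts for d in part]
    # "and": pick one disjunct from every conjunct and merge their literals
    return [frozenset().union(*combo) for combo in product(*parts)]

def refutation_disjuncts(implem, req):
    """Disjuncts of  implem AND NOT req; the condition holds iff none is satisfiable."""
    return dnf(nnf(("and", implem, ("not", req))))

# Toy instance (hypothetical atoms p, q, r)
implem = ("and", "p", ("imp", "p", "q"))
req = ("imp", "r", "q")
disjuncts = refutation_disjuncts(implem, req)
```

In this toy instance both disjuncts contain an atom together with its negation, so neither is satisfiable and the implication implem ⇒ req holds.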

6.1 Practical implementation 

An algorithm to implement the above decision procedure has been written
in Haskell (a widely used functional programming language). The algorithm
takes as its input a set of definitions of the behaviours of the primitive com- 
ponents (resistors, diodes, transistors) and a statement of the verification 
condition (an example is shown in the Appendix), and goes through the 
following steps: 

1. The input, which syntactically takes the form of a single formula (struc- 
tured by liberal use of let bindings), is parsed. This yields an abstract 
representation of the verification condition. 

2. Beta reduction is used to eliminate all let bindings, that is, to replace
locally defined names with their definitions and symbolically evaluate
all applications.

3. Any existential quantifiers present in the hypothesis of the formula are
moved outwards to the top level (where they become universal quantifiers).
In turn, all universals are rewritten as negated existentials.

4. The body of the formula is converted to disjunctive normal form and 
the existential quantifiers are distributed over the disjunction. 

5. Each separate disjunct (which itself corresponds to a conjunction of literals,
each one of which is a linear inequality) is converted to a standard
linear-programming matrix form.

6. Each such set of matrices is submitted for analysis to a standard Fortran
linear-programming library routine (routine E04MBF in the NAG
library). This routine quickly determines whether the set of inequalities
has a feasible solution.

If none of the sets of inequalities is found to have a feasible solution, then 
the original verification condition is true. Conversely, if a feasible solution 
is found, then this solution provides a counterexample demonstrating the 
falsity of the verification condition. 

In addition to the steps outlined above, the algorithm also incorporates
a number of optimisations. For instance, in step (5), a quick test for the
presence of opposing literals (that is, pairs of the form ... ∧ (x > a) ∧ ... ∧
(x < a) ∧ ...) allows many subformulae to be eliminated without the need
for submission to the linear-programming routine.
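
Steps (5) and (6), together with the opposing-literal test, can be sketched as follows. This is not the implementation described above: in place of the NAG linear-programming routine the sketch substitutes Fourier-Motzkin elimination over exact rationals, which is simple to state but can blow up on larger systems. The literal representation and the function names are assumptions made for illustration.

```python
from fractions import Fraction

# A literal is (coeffs, bound, strict), read as  sum(coeffs[x] * x) <= bound,
# or  < bound  when strict is True.

def has_opposing_literals(literals):
    """Cheap syntactic test for a pair  e < b  and  e > b  (both strict)."""
    keys = {(frozenset(co.items()), b) for co, b, strict in literals if strict}
    return any((frozenset((x, -c) for x, c in co), -b) in keys for co, b in keys)

def feasible(literals):
    """Decide feasibility over the rationals by Fourier-Motzkin elimination."""
    cs = [({x: Fraction(c) for x, c in co.items() if c != 0}, Fraction(b), s)
          for co, b, s in literals]
    while True:
        variables = {x for co, _, _ in cs for x in co}
        if not variables:
            break
        x = min(variables)                         # any elimination order is sound
        upper = [c for c in cs if c[0].get(x, 0) > 0]
        lower = [c for c in cs if c[0].get(x, 0) < 0]
        cs = [c for c in cs if x not in c[0]]
        for cu, bu, su in upper:
            for cl, bl, sl in lower:
                ku, kl = 1 / cu[x], 1 / -cl[x]     # positive multipliers cancelling x
                co = {}
                for y in (set(cu) | set(cl)) - {x}:
                    val = ku * cu.get(y, 0) + kl * cl.get(y, 0)
                    if val != 0:
                        co[y] = val
                cs.append((co, ku * bu + kl * bl, su or sl))
    # Only constant constraints  0 <= b  or  0 < b  remain.
    return all((0 < b) if s else (0 <= b) for _, b, s in cs)

def disjunct_refuted(literals):
    """True when the conjunction of literals has no solution."""
    return has_opposing_literals(literals) or not feasible(literals)
```

For example, the pair (x > 1) ∧ (x < 1) is encoded as `[({"x": -1}, -1, True), ({"x": 1}, 1, True)]` and is rejected by the syntactic test alone, whereas x ≥ 0, x + y ≤ 2, y ≥ 1 passes that test and is found feasible by elimination.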



7 Conclusions 

The overall aim of the work described here has been to extend existing 
predicate-logic based methods used for specifying and verifying systems de- 
fined at the “ideal digital” level of abstraction down to the analog level. This 
allows digital systems that (perhaps in order to gain speed or economise 
on power consumption) have been implemented in part at the analog level 
to be specified and reasoned about with the same degree of certainty and 
transparency as those implemented wholly at the digital level. 

The method described is straightforward; it simply involves using pred- 
icates to characterise the behaviour of analog components in terms of the 
voltages and currents at their terminals. A central tenet of the approach is 
that these characterisations should be conservative approximations to the 
actual behaviours. Based on detailed studies, in particular of implemen- 
tations of TTL technology, it has been found that these approximations 
can be remarkably weak and yet still capture enough of the essential be- 
havioural properties of an analog device to allow successful verification at 
the digital level of abstraction. Perhaps this should come as no surprise; 
after all, digital designers habitually reason at the analog level simply in 
terms of devices being ‘on’ or ‘off’ and of voltage levels being ‘high’ or ‘low’, 
and so on.

An algorithm for checking the verification conditions that arise with this
approach has been programmed and has been used to verify, automatically, 
simple TTL circuits. The computational time the algorithm takes has, how- 
ever, been found to increase sharply with circuit complexity. At present it 
is not clear whether the computational task is an inherently hard one or 
whether a different algorithm would be less sensitive to circuit size. Irre- 
spective of the algorithm used, one approach to limit the computational 
time is to partition an implementation into a set of smaller ones by in- 
troducing intermediate specifications (in effect, lemmas). This seems to be 
the approach the human designer uses when confronted with a complex 
implementation.

9 Appendix

Here is a simple example of an input to the verification condition checking 
algorithm. Overall, it consists of a single formula, the verification condition 
to be verified, viz. ∀u, v, i, j. imp u i v j ⇒ req u i v j. This is prefixed
with a number of let bindings that define the predicates imp (the derived 
specification of the proposed implementation, a two-stage transistor buffer) 
and req (a simple requirement specification). In turn, these bindings are 
prefixed by bindings that define predicates such as res (for resistors), diode
and trans (for NPN bipolar junction transistors). Throughout, currents and 
voltages are expressed in amps and volts. 

Note: in the interests of brevity, some of the bindings have been edited out, 
but all follow a similar pattern. 



-- Diode model
-- (schematic: diode symbol with terminal voltage v and current i)

let diode v i =
    let i1 = -1.0E-6;
        v2 = 0.3;
        i2 = 1.0E-6;
        v3 = 0.6;
        i3 = 10.0E-3
    in
        (v < 0.0 => (i < 0.0 & i > i1)) &
        ((v < v2) => (i < i2)) &
        ((v > v3) => (i > i3)) in



-- Transistor model
-- (schematic: NPN transistor with base terminal vb, ib and collector terminal vc, ic)

let hardOn v i =
    let v1 = 0.1;
        i1 = 10.0E-3
    in
        (v > v1) => (i > i1) in

let hardOff v i =
    let i1 = 10.0E-6
    in
        i < i1 in

let trans vb ib vc ic =
    let ib1 = 1.0E-6;
        ib2 = 1.0E-3
    in
        diode vb ib &
        (ib >= ib1 => hardOn vc ic) &
        (ib <= ib2 => hardOff vc ic)
    in



-- The requirement specification
-- (schematic: a box req with left terminal va, ia and right terminal vb, ib)

let req va ia vb ib =
    ia > 1.0E-3 => vb < 1.0 in



-- The implementation specification
-- (schematic of the two-stage transistor buffer: supply Vcc, resistors R1 to R4,
--  transistors T1 and T2, node voltages v1 to v3, branch currents i1 to i11)

let imp v1 i1 v3 i11 =
    let Vcc = 5.0;
        R1 = res 4.7E3;
        R2 = res 10.0E3;
        R3 = res 4.7E3;
        R4 = res 10.0E3;
        T1 = trans;
        T2 = trans in

    ?v2 i2 i3 i4 i5 i6 i7 i8 i9 i10.

        T1 v1 i1 v2 i3 &
        R1 (Vcc - v2) i2 &
        i2 = i3 + i4 &
        R2 v2 i5 &
        T2 v2 i6 v3 i7 &
        R3 (Vcc - v3) i8 &
        i8 = i7 + i9 &
        R4 v3 i9 &
        i9 + i11 = i10 in



-- The Correctness Criterion

!u v i j.
    imp u i v j => req u i v j
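
To make the reading of these predicates concrete, the diode model and the requirement can be transliterated into Python and evaluated at individual points. This is only an illustrative sketch: the helper implies and the sample operating points are assumptions, and evaluating a predicate pointwise is of course not a verification, which must cover all values of the variables.

```python
def implies(p, q):
    """Material implication, the reading of  =>  in the listing."""
    return (not p) or q

def diode(v, i):
    """Conservative diode characteristic (volts, amps), constants as in the listing."""
    i1, v2, i2, v3, i3 = -1.0e-6, 0.3, 1.0e-6, 0.6, 10.0e-3
    return (implies(v < 0.0, i < 0.0 and i > i1)
            and implies(v < v2, i < i2)
            and implies(v > v3, i > i3))

def req(va, ia, vb, ib):
    """Requirement: a sufficiently large input current forces a low output voltage."""
    return implies(ia > 1.0e-3, vb < 1.0)

# Sample operating points (hypothetical values chosen for illustration)
forward_conducting = diode(0.7, 20.0e-3)   # above 0.6 V with more than 10 mA
forward_starved = diode(0.7, 1.0e-3)       # above 0.6 V but too little current
reverse_leakage = diode(-1.0, -0.5e-6)     # small negative leakage current
```

The first and third points satisfy the diode predicate and the second does not. Between 0.3 V and 0.6 V the predicate places no constraint on the current at all, which reflects the deliberately weak, conservative style of characterisation used throughout.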
Abstract. Statecharts extend the concept of Mealy Machines by parallel
composition, hierarchy, and broadcast communication. While Statecharts
in principle are widely accepted in industry, some semantical
concepts, especially broadcasting, are still contested. In this contribution,
we present a Statechart dialect that includes the basic concepts of
the language and present a formal, relational semantics for it. We show
that this semantics can be used for both formal verification by model
checking and hardware synthesis.



1 Introduction 

Statecharts [12] are a visual specification language proposed for specifying reac- 
tive systems. They extend conventional state transition diagrams with structur- 
ing and communication mechanisms. These mechanisms allow the description 
of large and complex systems. Due to this property and their support by a 
number of tools, Statecharts have become quite successful in industry. The full 
Statecharts language, however, contains many mechanisms that cause problems 
concerning both their syntax and semantics. An overview of these problems can 
be found in [28]. 

In this paper, we describe a dialect of Statecharts, called μ-Charts. In contrast
to Statecharts as defined by Harel [12] but similar to Argos [18, 19], μ-Charts
can be clearly decomposed into subcharts. Thus, they can be developed in a
fully modular way by simply sticking them together. μ-Charts are restricted
to the most essential constructs. The basic components are sequential,
non-deterministic automata. μ-Charts can be composed in parallel and hierarchically
decomposed.

Argos and the approach followed in [14] provide steps in the same direction. Our 
work extends their approaches in many respects. 

* This work has been partially sponsored by the NADA Esprit Working Group 8533 
and the BMBF project “KorSys” . 



B. Möller and J.V. Tucker (Eds.): Prospects for Hardware Foundations, LNCS 1546, pp. 356-389, 1998.

First, Argos does not offer data variables. Second, although Argos also uses
an explicit feedback operator for communication it does not generate causality 
chains for signals as we do. Instead, it simply computes the set of broadcast 
signals by solving signal equations. In our semantics, however, each system step 
consists of a number of micro steps. In each such micro step further signals for 
broadcasting in the current system step are added until a fixed point is reached. 
In every single micro step we check whether the set of broadcasting signals is 
consistent. 

Furthermore, for Argos in [19] non-determinism is introduced by using external
prophecy signals. This is different from the approach we follow. In our semantics,
those prophecies are not necessary since non-determinism is treated internally.

Finally, while in the Statechart version described in [14] data variables are
available, it is not possible at all to construct larger charts by simply sticking
components together as in Argos or μ-Charts.

Although μ-Charts are powerful enough to describe large and complex reactive
systems, we assign a concise, formal semantics to them. It is given in a fully
mathematical way, based on the specification methodology FOCUS. Therefore,
we can mix pure functional FOCUS specifications [5, 11, 23] with μ-Charts. The
main intention of this paper is to demonstrate



— how to restrict and modify the syntax of traditional Statecharts [12] in order
to get a modular specification language,

— that μ-Charts are not a toy language but can be used to specify, verify, and
implement practical systems.



Since our semantics is based on the relational μ-calculus, we not only define
a theoretical semantics, but can also apply existing μ-calculus verification tools
as a basis for hardware generation and model checking.

This contribution is structured as follows. Section 2 gives an informal
introduction to our language. Syntax and semantics are explained in detail in
Sections 3 and 4, respectively. In Section 5 we outline how the μ-calculus encoding
can be used for formal verification. Finally, in Section 6 we present a hardware
design scheme using μ-Charts.



2 Introduction to μ-Charts



Before we formally describe syntax and semantics of our dialect we first give
some brief informal explanations together with an example, a central locking
system for cars.

2.1 Motivation



A μ-Chart is a specification of a component in a reactive system that exhibits
a cyclic behavior. In each cycle, an input is read, an output is emitted,
and the component state may change. In this respect, μ-Charts are similar to
ordinary Mealy machines: they have a finite set of states, an initial state, an
input and output alphabet, and a transition relation defined over current state,
input signal, output signal and next state.

For reactive systems, the input alphabet can be regarded as a set of signals
generated by the system’s environment that are relevant for the operation of the
system. Similarly, the output alphabet usually consists of events that influence
the future behavior of the environment.

The time between two cycles is non-zero and finite [1]. Between two cycles in 
the environment more than one event can occur that might be of interest to the 
system. The idea is then that the signals for all these events are collected in a set, 
and this set is used as input to the component. The output is then also defined to 
be a set of signals. Syntactically, this means that transition labels are not simply 
signal pairs, but consist of a Boolean expression that characterizes triggering 
input signal sets, and a set of output signals. Of course, the reaction itself in 
practice also consumes time, and, theoretically, events in the environment could 
be lost if they occurred during the system reaction. If, however, the system is 
sufficiently fast in comparison to the environment, we can disregard the reaction 
delay, and arrive at the synchronous time model illustrated in Figure 1. Since 
system reactions are assumed to be infinitely fast, they just divide time flow into 
finite intervals; at each interval border there is a system reaction where input is 
read and output produced. 



[Figure: a time axis divided into finite intervals, with an I/O reaction at each interval border]

Figure 1. Synchronous time model
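
The cyclic behavior with set-valued input and output can be sketched as a small interpreter. This is an illustration only, not part of the formal definitions in this paper: the class name SetMealy, the signal started and the rule of taking the first enabled transition are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

Signals = FrozenSet[str]
# (source state, trigger over the input set, emitted signal set, target state)
Transition = Tuple[str, Callable[[Signals], bool], Signals, str]

@dataclass
class SetMealy:
    state: str
    transitions: List[Transition]

    def step(self, inputs) -> Signals:
        """One reaction: read the collected input set, emit an output set."""
        present = frozenset(inputs)
        for source, trigger, output, target in self.transitions:
            if source == self.state and trigger(present):
                self.state = target
                return output
        return frozenset()          # no trigger holds: stay put, emit nothing

# Hypothetical two-state machine with Boolean triggers over the input set
watch = SetMealy("Off", [
    ("Off", lambda x: "s" in x and "r" not in x, frozenset({"started"}), "On"),
    ("On", lambda x: "r" in x, frozenset(), "Off"),
])
```

Each call to `step` corresponds to one interval border in the synchronous time model: all signals collected since the previous reaction are passed in as one set, and the reaction itself is treated as taking no time.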



Even when input and output are generalized to sets of signals, however, Mealy
machines are not practical as a specification formalism for reactive systems.
The reason is simply that using state machines for specifications often yields
diagrams that are too large to be written down or comprehended. For this
reason, Statecharts were suggested by Harel in [12]; they combine the operational
notions of Mealy machines with graphical devices to concisely describe large state
spaces.

Basically, Statecharts are defined inductively as follows:

— A Mealy machine where input and output alphabet are generalized to powersets
is a Statechart.

— The parallel composition of two Statecharts is a Statechart. Parallel compo- 
sition is the main technique to reduce the number of states needed for the 
specification. 

— A Mealy machine, where one state is replaced with a Statechart is itself a 
Statechart. This construction is called hierarchical decomposition, and is the 
main technique to reduce the number of transition arrows needed for the 
specification. 

In the rest of this section, we explain parallel composition and hierarchical de- 
composition in more detail. 



Parallel composition and communication. The main technique to reduce
complexity in the specification is the parallel composition of two or more state
machines. The state space of the parallel composition is the product of the state
spaces of the component state machines, yet the size of the specification grows only
linearly.

Intuitively, the components composed in parallel operate independently: for each 
input signal set, each component makes a transition and emits an output signal 
set. The output of the composition is the union of the component outputs. 

Unfortunately, only rarely can a component be specified by the independent 
composition of smaller specifications. In practice, the state machines composed 
in parallel should be able to communicate. To avoid the clutter resulting from 
communication channel names, broadcast communication can be introduced. 
Whenever a state machine emits an output signal, it is visible to the other ma- 
chines; there it can then cause further outputs, and so on. Thus, communication 
can lead to chain reactions of machine transitions. Only the result of the chain 
reaction with the accumulated output is then the visible reaction of the system. 

Together with our synchronous time model, communication can cause causality
conflicts. For example, assume that a machine M1 produces an output b if and
only if it receives input a, machine M2 produces output a only if it receives
input b, and machine M is the parallel composition of M1 and M2 with internal
communication of a and b. When neither a nor b is input from the environment,
should the output of M be the set {a, b} or the empty set?
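
The two candidate answers can be reproduced with a small sketch. It assumes, purely for illustration, that both machines emit whenever their triggering signal is visible and that broadcast signals are accumulated step by step starting from the environment input; this is not the communication semantics defined later in this paper, and the function names are hypothetical.

```python
def m1(visible):
    """M1 emits b when a is visible."""
    return {"b"} if "a" in visible else set()

def m2(visible):
    """M2 emits a when b is visible (assumed here to fire whenever it may)."""
    return {"a"} if "b" in visible else set()

def chain_reaction(env, machines):
    """Accumulate broadcast signals until nothing new is emitted."""
    env = set(env)
    visible = set(env)
    while True:
        emitted = set().union(*(m(visible) for m in machines))
        if env | emitted == visible:       # fixed point reached
            return emitted
        visible = env | emitted

no_input = chain_reaction(set(), [m1, m2])
with_a = chain_reaction({"a"}, [m1, m2])

# {a, b} is also self-consistent: if both signals are assumed present,
# the machines together emit exactly these two signals.
self_justifying = m1({"a", "b"}) | m2({"a", "b"})
```

Accumulating from the environment input yields the empty set when neither a nor b is supplied, while the set {a, b} is a second fixed point in which each signal justifies the other. The loop is only guaranteed to terminate for machines that never retract an output when more signals become visible.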

Plausible semantical definitions of communication are quite intricate, and they
are the major difference between the various Statechart dialects found in the
literature. The communication semantics of our dialect μ-Charts is defined in
Section 4.7.



Hierarchical decomposition. The second technique to reduce the complexity of
a specification is the introduction of hierarchy: groups of states with common
transitions can be gathered in a sub-chart (Figure 2). This way, the number of
transitions needed for a specification can be reduced.




[Figure: (a) Flat state machine; (b) Hierarchical state machine]

Figure 2. Hierarchical decomposition



In μ-Charts, as in most Statechart dialects, however, hierarchical decomposition
is not only employed to cluster states with identical transitions. In addition,
hierarchy is also used to model preemption. Our semantics realizes weak
preemption. Weak preemption means that the sub-chart still has the chance to
fire a transition before becoming inactive whenever the corresponding master
withdraws the control. Weak preemption can be used to model strong
preemption (strong preemption can be derived from weak preemption by signal
negation), but the converse is not true. Thus, we choose weak preemption as
the default mechanism because it is the more general concept [2].

Hierarchical decomposition, too, gives rise to interesting semantical questions. 
For example, what should be the proper behavior when the signal necessary to 
leave the hierarchy is produced within it? 



2.2 Example 



Figure 3 shows how the ideas of the previous section can be made concrete. The 
example models the central locking system of a two-door car. Table 1 gives an 
overview of the signals used. The signals marked external are inputs from the 
environment to the system, those marked internal are signals that are produced 
by the system; they are the system’s output. 

Our central locking system consists essentially of three main parts: Control
and the two door motors. These parts are composed in parallel. Locking and
unlocking the doors leads to complex signal interactions. The default
configuration of the system is that all doors are unlocked (Unld) and both motors are
Off. Now the driver can lock the car either from outside by turning the key
or from inside by pressing a button. Both actions generate the external signal
c. The Control unit generates the internal signals ldn and rdn and enters its
locking state Lockg, which is decomposed by the automaton in Figure 4.

Instantaneously, influenced by ldn and rdn, respectively, both motors begin to
lock the doors by entering their Down states. Those states are decomposed by
the sequential automata pictured in Figure 5. Thus, the motors are additionally
in their Start states. As the speed of the motors depends on external influences
like their temperature, each motor needs either one or two time units to finish
the lowering process. Only when both have sent their ready messages lmr and
rmr does the Control enter the Both state and produce the signal ready. The
effect of this signal is twofold: on the one hand, the Control terminates itself
immediately and enters the Locked state; on the other hand, both motors
are also triggered by this signal and are switched Off.

In our syntax communication is expressed by an explicit feedback operator. It 
is graphically indicated by the box on the bottom of Figure 3. 

Whenever the crash signal occurs and the ignition is on, the Control changes 
from the Normal mode to the Crash mode and generates the signals lup and 
rup. 



Signal    Meaning                     Source
crash     Crash sensor                External
o         Opened with external key    External
c         Closed with external key    External
ignition  Ignition on                 External
lmr       Left motor ready            Internal
rmr       Right motor ready           Internal
lup       Left motor up               Internal
ldn       Left motor down             Internal
rup       Right motor up              Internal
rdn       Right motor down            Internal
ready     Un-/Locking process ready   Internal

Table 1. Signals used in the locking system



Note that our sequential automata may contain nondeterminism. Nondeterminism
is introduced whenever there are two transitions from a single state with
nonexclusive trigger conditions. In our example, for instance, the automaton
MotorLeft contains nondeterminism: if both signals ldn and lup are input
while the automaton is in state Off, the transition to Down or the transition
to Up may be taken.

[Figure: the chart Control composed with the door motors; feedback box with the signals lup, ldn, lmr, rup, rdn, rmr, ready]

Figure 3. Central locking system



3 Syntax 

In this section, we formally define a textual syntax for μ-Charts. It corresponds
to the graphical syntax used in the central locking system example in Section 2.
The language μ-Charts is based on Mini-Statecharts, as first presented in [21]
and later refined in [24, 25, 27]. We only repeat those concepts that are a
prerequisite for the extension to nondeterminism and assume the reader to be
familiar with the principles of hierarchical, interacting state machines.

Throughout this paper, M denotes a set of signal names, States a set of state
names, and Ident a set of identifier names for sequential automata. For any
chart, only a finite number of signal, state, and automata names can be used;
℘(X) denotes the set of finite subsets of some set X.

In addition to the constructs used in the example, we allow sequential automata
to have local (state) variables. Transition actions may then consist of simple
imperative programs.

Figure 4. Decomposition of Lockg and Unlg

Figure 5. Decomposition of Down and Up, x ∈ {l, r}

Hence, let V be a set of local integer-valued variables. Note that local variables
were not needed in the specification of the central locking system. V has to be
disjoint from the sets introduced so far. The other syntactic sets associated with
a simple language for transitions (borrowed from [30] and adapted for our
purposes) are: integers Int, truth values Bool = {true, false}, arithmetic expressions
Aexp, Boolean expressions Bexp, and commands Com.

The set of μ-Charts S is defined inductively. A μ-Chart is either a sequential
automaton, a parallel composition of two μ-Charts, the decomposition of
a sequential automaton’s state by another μ-Chart, or the result of a feedback
construction for broadcasting. The inductive steps are motivated and defined in
Sections 3.1 to 3.3.



3.1 Sequential Automata 

Sequential automata Seq (A, V_l, β_d, Σ, σ_d, σ, δ) are the basic elements of our
Statechart dialect. They consist of:

1. A ∈ Ident is the unique identifier of the automaton.

2. V_l ⊆ V is a set of local variables of the automaton. All variables in V_l can
only be read or written by operations on transitions of the automaton itself.
Other automata do not have any read or write access to them.

3. Whenever the automaton is initialized, that is, starts in its default state σ_d,
every local variable in V_l is also initialized according to the initialization
function β_d : V_l → ℤ.

4. Σ ⊆ States is a nonempty finite set of all states of the automaton.

5. σ_d ∈ Σ represents the default state.

6. σ ∈ Σ represents the current state.

7. δ : Σ × ℘(M) → ℘(Σ × Com) is the finite, total state-transition relation that
takes a state and a finite set of signals and yields a set of pairs, each
consisting of a next state and a command. If this set contains more than one pair, the
automaton is nondeterministic; if the set is empty, the automaton cannot
react to the current input when it is in state σ. In the latter case, we assume
that the automaton remains in its current state and does not produce any
output signals.
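
The role of the total, set-valued transition relation can be illustrated with a short sketch. It builds such a relation from transitions labelled with Boolean triggers and shows the two special cases named in item 7. The state and signal names follow the MotorLeft discussion in Section 2.2, but the concrete triggers and the use of skip as the command are assumptions, since the corresponding figure is not reproduced here; make_delta and successors are hypothetical helper names.

```python
from itertools import chain, combinations

def powerset(signals):
    """All subsets of a finite signal set, as frozensets."""
    s = sorted(signals)
    subsets = chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
    return [frozenset(c) for c in subsets]

def make_delta(states, signals, labelled):
    """Total relation: every (state, signal set) maps to a set of (next state, command).

    labelled is a list of (source, trigger predicate, command, target).
    """
    return {(q, x): {(t, c) for (s, trig, c, t) in labelled if s == q and trig(x)}
            for q in states for x in powerset(signals)}

def successors(delta, state, x):
    """Possible reactions; an empty entry means: stay in the state, do nothing."""
    options = delta[(state, frozenset(x))]
    return options or {(state, "skip")}

# Hypothetical fragment of the left door motor
labelled = [
    ("Off", lambda x: "ldn" in x, "skip", "Down"),
    ("Off", lambda x: "lup" in x, "skip", "Up"),
]
delta = make_delta({"Off", "Down", "Up"}, {"ldn", "lup"}, labelled)
```

With both ldn and lup present in state Off, `successors` returns two pairs, which is exactly the nondeterministic case; with no signal present it returns the single pair that keeps the automaton in Off.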

In our concrete syntax (see the example), we use a Boolean term t instead of a
set of signals x ∈ ℘(M) as trigger. It is straightforward to translate a partial
transition function that deals with arbitrary Boolean terms as trigger conditions
into a set-valued total function (see for example [27]).

A transition takes place in exactly one time unit (see Section 2.1). In a spec- 
ification with several automata working in parallel, more than one automaton 
can make a transition; all transitions taken in parallel automata are assumed 
to occur in the same time unit. Note, however, that every single sequential au- 
tomaton only is allowed to make one transition in one instant. The set of all 
system actions in one time unit is called a step. 



Transition Syntax. In this section, we introduce the syntax of μ-Charts with
local variables and value-carrying signals. In this paper, we only consider integer
numbers as data values; however, the language can easily be extended to other
datatypes and operators.

In presenting the syntax of our transition language we will follow the convention
that n ∈ Int, X ∈ V, E ∈ M, a ∈ Aexp, b ∈ Bexp, and c ∈ Com. Note that
Boolean expressions are only used in commands and not as trigger conditions.
Arithmetic/Boolean expressions and commands are formed by:

a ::= n | X | a1 binop a2
b ::= true | false | a1 equ a2 | a1 leq a2 | not b | b1 and b2
c ::= skip | X := a | E | if b then c1 else c2 fi | c1, c2

Here binop stands for any of the usual binary operations on integers, such
as addition or multiplication. As usual, if b then c fi is an abbreviation for
if b then c else skip fi, and single skip commands can be omitted. For example,
in Figure 7 we simply write s ∧ ¬r and r instead of s ∧ ¬r/skip and r/skip,
respectively.

The meaning of Boolean and arithmetic expressions is straightforward whereas
the meaning of commands needs some explanation. X := a assigns the value of
the arithmetic expression a to the variable X. In this context, X is local with
respect to the automaton that contains the transition. E generates the pure
signal E as output in the current instant. E is said to be present in the current
step only. In the very next step, E is absent again unless it is generated afresh.
The if-then-else construct is used as usual.

The command c1, c2 stands for the parallel composition of the two commands
c1 and c2. This construct has to be treated with some care. In traditional
Statecharts, when two or more commands want to change the same variable in
the same step, so-called race conditions [13] can occur, which have to be detected
by Statemate’s simulation and dynamic test tools because the values of the
variables are unknown before runtime.

In previous work [24, 25], to get a more deterministic behavior, we chose
sequential execution c1; c2 instead of parallel composition. However, the program
counter needed for sequential execution makes both hardware implementation
and formal verification more complex.

As a consequence, we have now decided to realize a compromise between the
two above-mentioned approaches. Instead of the sequential composition c1; c2
proposed in [24, 25] we use the parallel composition c1, c2 as in [13], but avoid
race conditions by the weak Bernstein condition.

Bernstein’s condition demands that each common location must not occur on
the left-hand side of an assignment. In our context possible locations are local
variables. Possible assignments are of the form X := a. The following
composition is an example: “X := 1, if X = 2 then Y := 3 else Y := 4”, abbreviated by
c1, c2. The only common variable of c1 and c2 is X. However, in c1 we have
a write access to X and therefore c1, c2 does not fulfill Bernstein’s condition.
Such commands are not valid with respect to this condition in our setting. Note
that a procedure that decides whether a command is valid or not is based on the
syntax only and can easily be implemented; it is therefore part of the static
analysis.
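
A purely syntactic check of this kind can be sketched as follows. The tuple encoding of commands and expressions and the function names are assumptions made for this illustration, and the check implements the reading given above: a variable that occurs in both components of a parallel composition must not be assigned in either of them.

```python
def expr_vars(e):
    """Variables of an expression: a variable is a str, an operator node a tuple."""
    if isinstance(e, str):
        return {e}
    if isinstance(e, tuple):                   # (operator, operand, ...)
        return set().union(*(expr_vars(x) for x in e[1:]))
    return set()                               # integer or Boolean constant

def accesses(c):
    """Return (variables occurring in c, variables assigned in c)."""
    kind = c[0]
    if kind in ("skip", "signal"):             # ("skip",) or ("signal", E)
        return set(), set()
    if kind == "assign":                       # ("assign", X, a)
        return {c[1]} | expr_vars(c[2]), {c[1]}
    if kind == "if":                           # ("if", b, c1, c2)
        (u1, w1), (u2, w2) = accesses(c[2]), accesses(c[3])
        return expr_vars(c[1]) | u1 | u2, w1 | w2
    if kind == "par":                          # ("par", c1, c2)
        (u1, w1), (u2, w2) = accesses(c[1]), accesses(c[2])
        return u1 | u2, w1 | w2
    raise ValueError(f"unknown command: {kind!r}")

def satisfies_bernstein(c):
    """No variable shared by the two sides of a parallel composition is written."""
    kind = c[0]
    if kind == "if":
        return satisfies_bernstein(c[2]) and satisfies_bernstein(c[3])
    if kind == "par":
        (u1, w1), (u2, w2) = accesses(c[1]), accesses(c[2])
        return (satisfies_bernstein(c[1]) and satisfies_bernstein(c[2])
                and not ((u1 & u2) & (w1 | w2)))
    return True

# The composition from the text: X := 1, if X = 2 then Y := 3 else Y := 4
c1 = ("assign", "X", 1)
c2 = ("if", ("equ", "X", 2), ("assign", "Y", 3), ("assign", "Y", 4))
rejected = ("par", c1, c2)

# Two assignments that merely read a shared variable Z
accepted = ("par", ("assign", "X", "Z"), ("assign", "Y", "Z"))
```

The first composition is rejected because X is common to both components and is written in c1; the second is accepted because the only shared variable, Z, is read but never assigned.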

Note further that Bernstein’s condition implies commutativity, i.e. commuta- 
tivity is a weaker condition than Bernstein’s. Hence, the meaning of parallel 
composed commands does not depend on their sequential ordering. 



Example. As an example of a μ-Chart with local variables we consider a
three-digit stopwatch. Each digit is implemented as a seven-segment display as shown
in Figure 6. The stopwatch is part of a benchmark collection for hardware design
and verification [17]. The corresponding μ-Chart specification is shown in Figure
7. Note that although we used the variable name X in four different automata,
all variables with this name are local with respect to the automaton in which
they occur. As a consequence, as all variables are local, we do not have to require
Bernstein’s condition to be fulfilled in case of parallel composition.



Figure 6. Seven-segment display

The stopwatch is controlled by two buttons: “start/stop” ($s$) and “reset” ($r$).
Table 2 provides an overview of the signals in use. The watch is started
by pressing the start/stop button. The display then shows the time that has
elapsed, where the rightmost display (Lo) shows tenths of seconds and the
remaining two (Med and Hi) show seconds and tens of seconds, respectively. Thus,
the watch can display times from 00.0 up to 99.9 seconds.



Signal   Meaning             Source
s        Start/Stop          External
r        Reset               External
time     Tenths of seconds   Internal
med      Seconds             Internal
low      Tens of seconds     Internal

Table 2. Signals used in the stopwatch



The stopwatch is driven by a 1 MHz external clock. Hence, every 100,000 clock
pulses the lowest digit on the display increases by one if it is less than nine;
otherwise, it is reset to zero and the medium digit increases by one. When
the reset button is pressed, the watch returns to its Off state and the display
shows 00.0. If, however, the start/stop button is pressed again instead of the
reset button, the watch stops counting until this button is pressed once more. The
stopwatch in Figure 7 is modeled with five sequential automata: $S_{Stopwatch}$,
$S_{Timer}$, $S_{Hi}$, $S_{Med}$, and $S_{Low}$. They are composed as follows (the textual syntax
of the subsequent example is explained in the following sections):



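\[
\begin{aligned}
& \mathit{Dec}\ S_{Stopwatch}\ \mathbf{by}\ g_{Stopwatch} \\
& g_{Stopwatch}(\mathit{Off}) = \mathit{NoDec} \\
& g_{Stopwatch}(\mathit{On}) = \mathit{Feedback}(\mathit{And}(S_{Display}, S_{Timer}), \ldots) \\
& S_{Digits} = \mathit{Feedback}(\mathit{And}(S_{Hi}, S_{Med}, S_{Low}), \{med, low\})
\end{aligned}
\]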



The watch contains two basic control states: On and Off. The former is
decomposed into two parallel components, namely display and timer:

\[ \mathit{And}(S_{Display}, S_{Timer}) \]

There are two feedback constructions for broadcast communication: one around
$S_{Display}$, and one around the decomposed On state.

The external clock signal does not appear in the specification. Instead, we 
assume that it is this clock signal that causes the steps of a system, and thus 
determines the time model for the specification. 
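The counting behavior described above (one display update every 100,000 clock pulses, with decimal carry between the digits) can be sketched as follows. This is a plain Python illustration, not the μ-Chart of Figure 7; the function names are hypothetical, and the wrap-around after 99.9 is an assumption, since the text does not say what happens on overflow.

```python
PULSES_PER_TENTH = 100_000  # 1 MHz clock, one display update per 0.1 s


def advance_tenth(digits):
    """Advance the display (hi, med, lo) by one tenth of a second."""
    hi, med, lo = digits
    lo += 1
    if lo == 10:           # lowest digit overflows: carry into the medium digit
        lo, med = 0, med + 1
    if med == 10:          # medium digit overflows: carry into the highest digit
        med, hi = 0, hi + 1
    if hi == 10:           # assumption: wrap around after 99.9
        hi = 0
    return hi, med, lo


def run(pulses, digits=(0, 0, 0)):
    """Display contents after the given number of clock pulses while counting."""
    for _ in range(pulses // PULSES_PER_TENTH):
        digits = advance_tenth(digits)
    return digits


print(run(1_000_000))  # one second of clock pulses
```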



3.2 Parallel Composition 

If $S_1$ and $S_2$ are elements of the set $\mathcal{S}$, then their parallel composition, denoted
by the syntax

\[ \mathit{And}(S_1, S_2) \]

is in $\mathcal{S}$, too. There are no syntactic restrictions on this composition. In the
graphic notation, parallel components are separated by splitting a box into
components using dashed lines [12].

In our framework, parallel composition does not imply broadcast communication 
between the subcharts. Both subcharts operate independently; communication 
is introduced by an explicit feedback operator (see Section 3.3). 

Informally, the parallel composition of μ-Charts behaves as $S_1$ and $S_2$ in
synchrony. Generated signals of the parallel components are joined. The parallel
composition is commutative and associative. We therefore write $\mathit{And}(S_1, \ldots, S_n)$
to denote $n \in \mathbb{N}$ nested parallel μ-Charts.



3.3 Broadcast Communication 

Parallel composition is used to construct independent, concurrent components. 
To allow interaction of such components, our language provides a broadcast 
communication mechanism. In [12], for example, this mechanism already is inte- 
grated in the parallel composition of Statecharts. There, broadcasting is achieved 
by feeding back all generated signals to all components. This means that there 
exists an implicit feedback mechanism at the outermost level of a Statechart. 
Unfortunately, this implicit signal broadcasting leads to a non-compositional
semantics. We avoid this problem by adding an explicit feedback operator.

In the literature, different semantic views of the feedback mechanism can be
found [28]. For the deterministic version of our language [21, 24, 27], we provided
different syntactic constructs with different communication timings. We believe
that for nondeterministic, abstract specifications instantaneous feedback is the
proper concept, since it is better suited to behavioral refinement. Hence, we
present only this operator here.

Suppose that $S \in \mathcal{S}$ is an arbitrary μ-Chart and $L \in \wp(M)$ is the set of signals
that should be fed back. Then the construct

\[ \mathit{Feedback}(S, L) \]

is also in $\mathcal{S}$. Graphically, the feedback construction is denoted by a box below
the μ-Chart $S$. This box contains the signals $L$ that are fed back.

Instantaneous feedback follows the perfect synchrony hypothesis of Berry [1];
this demands that an action and the event causing this action occur at the
same instant of time. Therefore, the signals in $z$ generated by chart $S$ are
instantaneously intersected with the signals $L$ to be fed back and then joined
with the external signals $x$. This signal set $x \cup (z \cap L)$ is passed to $S$ at the same
instant.



3.4 Hierarchical Decomposition 

The concept of hierarchically structuring the state space is essential for
Statecharts. In our Statechart dialect, hierarchy is introduced by replacing states of
a sequential automaton (the master) with arbitrary charts (the slaves). This
replacement is expressed by a finite, partial function $g$, which is defined for those
states $\sigma$ of the master that are further decomposed. The decomposition
function $g$ yields the refining slave chart. Suppose that $\mathit{Seq}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)$ is
a sequential automaton. Then hierarchical decomposition is denoted by

\[ \mathit{Dec}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)\ \mathbf{by}\ g \]

where $g : \Sigma \rightharpoonup \mathcal{S}$. Like other formal Statechart semantics [14, 18, 19], the
approach presented here has no history states. It is possible to extend our
semantics along the lines of [21]; due to space limitations we omit this extension
here. Throughout this paper, we assume that the slave is always re-initialized
when leaving it.



4 Semantics 

In this section, we introduce the transition relation for a μ-Chart. It is defined
inductively following the syntactic structure of the language. The transition
relations presented here are based on the semantics presented in [22]. μ-Charts
are synchronized by a global, discrete clock. Each transition relation formally
denotes the relationship between two system configurations, i.e. the sets of all
currently valid control states of all sequential automata, at two subsequent
instants.



4.1 Transition Semantics 



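To define the denotational semantics of transitions we first need a valuation $\varepsilon$,
a total function $\varepsilon : V \to \mathbb{Z}$. The set of all valuations is denoted by $\mathcal{E}$. Note that
$\mathcal{E}$ contains total functions: variables have a defined value at every single time
point, whereas signals have proper values only when they are present. With this
background we are able to define the semantic functions:

\[
\begin{aligned}
\mathcal{A}[\![\,.\,]\!] &: \mathit{Aexp} \to (\mathcal{E} \to \mathbb{Z}) \\
\mathcal{B}[\![\,.\,]\!] &: \mathit{Bexp} \to (\mathcal{E} \to \mathbb{B}) \\
\mathcal{C}[\![\,.\,]\!] &: \mathit{Com} \to (\wp(M) \times \mathcal{E} \to \wp(M) \times \mathcal{E})
\end{aligned}
\]

where $\mathbb{B} = \{tt, ff\}$. We define the denotation of an arithmetic expression as
follows: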



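\[
\begin{aligned}
\mathcal{A}[\![n]\!]\varepsilon &= n \\
\mathcal{A}[\![X]\!]\varepsilon &= \varepsilon(X) \\
\mathcal{A}[\![a_1\ \mathit{binop}\ a_2]\!]\varepsilon &= \mathcal{A}[\![a_1]\!]\varepsilon\ \mathit{binop}\ \mathcal{A}[\![a_2]\!]\varepsilon
\end{aligned}
\]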



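The denotation of a Boolean expression is also defined inductively:

\[
\begin{aligned}
\mathcal{B}[\![\mathit{true}]\!]\varepsilon &= tt \\
\mathcal{B}[\![\mathit{false}]\!]\varepsilon &= ff \\
\mathcal{B}[\![a_1\ \mathit{equ}\ a_2]\!]\varepsilon &= \big(\mathcal{A}[\![a_1]\!]\varepsilon = \mathcal{A}[\![a_2]\!]\varepsilon\big) \\
\mathcal{B}[\![a_1\ \mathit{leq}\ a_2]\!]\varepsilon &= \big(\mathcal{A}[\![a_1]\!]\varepsilon \leq \mathcal{A}[\![a_2]\!]\varepsilon\big) \\
\mathcal{B}[\![\mathit{not}\ b]\!]\varepsilon &= \neg\,\mathcal{B}[\![b]\!]\varepsilon \\
\mathcal{B}[\![b_1\ \mathit{and}\ b_2]\!]\varepsilon &= \mathcal{B}[\![b_1]\!]\varepsilon \wedge \mathcal{B}[\![b_2]\!]\varepsilon
\end{aligned}
\]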

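The definition of $\mathcal{C}[\![c]\!]$ for commands $c$ is a bit more complex than the definitions
of $\mathcal{A}[\![\,.\,]\!]$ and $\mathcal{B}[\![\,.\,]\!]$:

\[
\begin{aligned}
\mathcal{C}[\![\mathit{skip}]\!](x, \varepsilon) &= (x, \varepsilon) \\
\mathcal{C}[\![X := a]\!](x, \varepsilon) &= \text{let } n = \mathcal{A}[\![a]\!]\varepsilon \text{ in } (x, \varepsilon[n/X]) \\
\mathcal{C}[\![E]\!](x, \varepsilon) &= (x \cup \{E\}, \varepsilon) \\
\mathcal{C}[\![c_1, c_2]\!](x, \varepsilon) &= \mathcal{C}[\![c_2]\!](\mathcal{C}[\![c_1]\!](x, \varepsilon)) \\
\mathcal{C}[\![\mathit{if}\ b\ \mathit{then}\ c_1\ \mathit{else}\ c_2]\!](x, \varepsilon) &=
\begin{cases}
\mathcal{C}[\![c_1]\!](x, \varepsilon) & \text{if } \mathcal{B}[\![b]\!]\varepsilon = tt \\
\mathcal{C}[\![c_2]\!](x, \varepsilon) & \text{otherwise}
\end{cases}
\end{aligned}
\]

Here we write $\varepsilon[n/X]$ for the function obtained from $\varepsilon$ by replacing its value at
$X$ by $n$. Note that, because of the commutativity of the parallel composition, we
could also have defined $\mathcal{C}[\![c_1, c_2]\!](x, \varepsilon) = \mathcal{C}[\![c_1]\!](\mathcal{C}[\![c_2]\!](x, \varepsilon))$.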
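The three semantic functions translate almost directly into a small interpreter. The sketch below is an illustration only: the tuple-based syntax trees, the tag names, and the choice of `+`, `-`, `*` as the available binary operators are assumptions, since the text leaves `binop` abstract. Signal sets are `frozenset`s and valuations are dictionaries.

```python
import operator

# Assumption: the text leaves "binop" abstract; these three are picked for illustration.
BINOPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def A(a, env):
    """Arithmetic expressions: ("num", n) | ("var", X) | ("binop", op, a1, a2)."""
    tag = a[0]
    if tag == "num":
        return a[1]
    if tag == "var":
        return env[a[1]]
    if tag == "binop":
        _, op, a1, a2 = a
        return BINOPS[op](A(a1, env), A(a2, env))
    raise ValueError(f"unknown arithmetic expression: {a!r}")


def B(b, env):
    """Boolean expressions: true, false, equ, leq, not, and."""
    tag = b[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "equ":
        return A(b[1], env) == A(b[2], env)
    if tag == "leq":
        return A(b[1], env) <= A(b[2], env)
    if tag == "not":
        return not B(b[1], env)
    if tag == "and":
        return B(b[1], env) and B(b[2], env)
    raise ValueError(f"unknown Boolean expression: {b!r}")


def C(c, x, env):
    """Commands map a pair (signal set, valuation) to a new pair."""
    tag = c[0]
    if tag == "skip":
        return x, env
    if tag == "assign":                       # X := a
        _, var, a = c
        return x, {**env, var: A(a, env)}     # env[n/X]
    if tag == "emit":                         # generate signal E
        return x | {c[1]}, env
    if tag == "par":                          # c1, c2  (evaluated as c2 after c1)
        return C(c[2], *C(c[1], x, env))
    if tag == "if":
        _, b, c1, c2 = c
        return C(c1 if B(b, env) else c2, x, env)
    raise ValueError(f"unknown command: {c!r}")


# X := 1   composed with   if Z leq 2 then (Y := 3, emit a) else Y := 4
c1 = ("assign", "X", ("num", 1))
c2 = ("if", ("leq", ("var", "Z"), ("num", 2)),
      ("par", ("assign", "Y", ("num", 3)), ("emit", "a")),
      ("assign", "Y", ("num", 4)))
env0 = {"X": 0, "Y": 0, "Z": 2}

print(C(("par", c1, c2), frozenset(), env0))
print(C(("par", c2, c1), frozenset(), env0))
```

The two commands in this example share no variable, so they satisfy Bernstein’s condition and both orders of evaluation give the same result.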



4.2 Preliminaries 



Steps and System Reactions. Like other Statechart dialects, μ-Charts is a
synchronous language based on a discrete time model. A run of a μ-Chart system
consists of a sequence of steps. At each step, the system receives a set of signals
from the environment. Upon receipt of this input set, the system produces a set
of output signals and may change its state. The output signals are assumed to
be generated in the same instant as the input signals are received. A signal is
said to be present in a given instant if it is either input from the environment
or generated by the system in this instant. Otherwise, it is said to be absent.



Avoiding Multiple Transitions in One Step. As we deal with instantaneous
feedback, transitions of several different sequential automata can fire
simultaneously. However, every single automaton can make only one step in one
instant, i.e. no two consecutive transitions of a sequential automaton are taken
in a step. This informal requirement has to be formalized in the automaton’s
transition relation. Furthermore, we have to ensure that only one branch of a
nondeterministic choice in an automaton is taken in a step.

Both restrictions can be ensured using additional signals. For each sequential
automaton $\mathit{Seq}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)$ we introduce a signal $\copyright_N$. Informally, this
is a copyright on transitions of the automaton, signaling that $N$ has already made
a step. When the signal is not present, the automaton may still make a transition,
whereupon it will generate $\copyright_N$. If $\copyright_N$ is already present, the automaton has to
stay in its current state. The need for this signal will become clearer later when
we introduce broadcast communication. The copyright signals are introduced in
the following way. Each transition of $N$ with label $c/y$ is modified such that:

— The trigger condition $c$ is strengthened by conjoining $\neg\copyright_N$ to it.

— The action set $y$ is extended by $\copyright_N$.

We assume all signals $\copyright_N$ to be disjoint from the signals in $M$ and define $M_\copyright$ by
$M \cup \{\copyright_N \mid N \in \mathit{Ident}\}$.
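The two modifications can be sketched as a transformation on a list of transitions. The representation is an assumption made for illustration: a transition is a tuple of source state, positive trigger signals, negated trigger signals, action set, and target state, and the copyright signal of automaton `N` is written `CR_N`, following the naming used in the encoding of Section 5.2.

```python
def add_copyright(name, transitions):
    """Guard every transition with the negated copyright signal and emit it."""
    cr = f"CR_{name}"
    return [(src, pos, neg | {cr}, act | {cr}, dst)
            for (src, pos, neg, act, dst) in transitions]


def enabled(transitions, state, present):
    """Transitions leaving `state` whose trigger holds for the present signals."""
    return [t for t in transitions
            if t[0] == state and t[1] <= present and not (t[2] & present)]


# Hypothetical one-transition automaton: Off --ldn/--> Down
plain = [("Off", frozenset({"ldn"}), frozenset(), frozenset(), "Down")]
guarded = add_copyright("Motorleft", plain)

print(enabled(guarded, "Off", {"ldn"}))                  # may still fire
print(enabled(guarded, "Off", {"ldn", "CR_Motorleft"}))  # already made a step
```

Once `CR_Motorleft` is present in the current signal set, no transition of the automaton is enabled any more, which is the "at most one step per instant" requirement stated above.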



Negation in Trigger Expressions. Negation in trigger expressions can lead to
some tricky causality problems. For example, what would be the semantics of a
transition labeled $\neg a/a$? Some Statecharts semantics simply disallow Statecharts
with causality problems. They require a static analysis of the chart, which might
reject charts that do not really have causality conflicts. This is, for instance, the
approach taken by Argos [18] and by the reactive programming language Esterel [3].

We handle these conflicts semantically. In case of a causal conflict, the transition
is simply not taken. We accomplish this through oracle signals that predict the
presence or absence of a given signal in a step. For each signal $a$ that occurs
negatively in the trigger of a transition, we introduce a new signal $\tilde{a}$ that replaces
$a$ in the trigger part of a transition label. We define $\tilde{M}$ to be $M \cup \{\tilde{a} \mid a \in M\}$.
However, oracle signals can cause the following two inconsistencies:

— A signal $a$ is generated by the system or input from the environment,
although the oracle forecasts its absence. In other words, $a$ is in the signal set,
but not $\tilde{a}$.

— A signal $a$ that is predicted to be present is neither input nor generated by
the system. In other words, $\tilde{a}$ is in the signal set, but not $a$.

The requirement to avoid these inconsistencies is formally expressed by:

\[ \mathit{Consistence}(x, y, o) = \Big(\bigwedge_{s \in x \cup y} \tilde{s} \in o\Big) \wedge \Big(\bigwedge_{\tilde{s} \in o} s \in x \cup y\Big) \]

where $x$, $y$, and $o$ denote the sets of input, output, and oracle signals,
respectively. This technique is similar to that used in the bottom-up evaluation of
logic programs with negation as presented in [16]. For a detailed discussion of
this topic the interested reader is referred to [22].
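As a rough illustration, the consistency requirement can be checked as follows. The sketch makes two simplifying assumptions: an oracle for signal `a` is represented by the name `a` itself inside the oracle set `o`, and the check is restricted to an explicit set `oracle_signals` of signals that are referenced negatively somewhere in the chart.

```python
def consistence(x, y, o, oracle_signals):
    """True iff, for every signal with an oracle, prediction and presence agree."""
    present = x | y
    return all((s in present) == (s in o) for s in oracle_signals)


oracles = {"lmr", "rmr"}
print(consistence({"lmr"}, set(), {"lmr"}, oracles))  # predicted present, is present
print(consistence(set(), {"rmr"}, set(), oracles))    # generated, but predicted absent
print(consistence(set(), set(), {"lmr"}, oracles))    # predicted present, never occurs
```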



4.3 Configurations 

Configurations $c = (\gamma, \varepsilon) \in C$ are defined inductively. Each configuration has a
control part $\gamma$ and a data part $\varepsilon$. The configuration of a sequential automaton is
simply the tuple of its current control and data state. To denote the configuration
of an And-chart $\mathit{And}(S_1, S_2)$ we use a tuple¹ $(\gamma_1, \varepsilon_1, \gamma_2, \varepsilon_2)$, where $(\gamma_1, \varepsilon_1)$ and
$(\gamma_2, \varepsilon_2)$ are the configurations of the parallel components $S_1$ and $S_2$, respectively.
The configuration of $\mathit{Feedback}(S, L)$ is simply the configuration of $S$.

For hierarchical decomposition we need a slightly more subtle notation. The
master is decomposed into $n =_{df} |\mathrm{dom}\, g|$ slaves, where $\mathrm{dom}\, g$ denotes the domain
of the partial function $g$. The configurations of these slaves are denoted by
$c_1, \ldots, c_n$, whereas the configuration of the master is denoted by $c_m$. The overall
configuration of

\[ \mathit{Dec}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)\ \mathbf{by}\ g \]

is then the $(n+1)$-tuple $(c_m, c_1, \ldots, c_n)$.



¹ Or $((\gamma_1, \varepsilon_1), (\gamma_2, \varepsilon_2))$, more precisely. However, we omit parentheses whenever the
meaning is clear from the context.

In the sequel, we formulate the transition relations for every single syntactic
construct of the μ-Charts language. We have two different categories of
predicates: one for initialization and one for the transition step from one configuration
to the next. These predicates have the types

\[
\begin{aligned}
\mathit{Init}_S &: C \to \mathit{Bool} \\
\mathit{Trans}_S &: C \times \wp(M_\copyright) \times C \times \wp(M_\copyright) \times \wp(\tilde{M}) \to \mathit{Bool}
\end{aligned}
\]

for every μ-Chart $S$. A predicate $\mathit{Trans}_S(c, x, c', y, o)$ is true whenever the
current configuration of $S$ is $c$ and $S$ can, stimulated by the input signal set
$x$, reach the subsequent configuration $c'$ in exactly one instant while producing
the output signal set $y$. The set $o$ includes those oracles that are needed for the
treatment of negative signals in $S$.



4.4 Sequential Automata 

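Initially, a sequential automaton $S =_{df} \mathit{Seq}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)$ is in its default
configuration $(\sigma_d, \beta_d)$. For an input signal set $x$ coming from the environment,
$S$ generates a set $y$ of output signals and changes its configuration from $(\gamma, \varepsilon)$
to $(\gamma', \varepsilon')$:

\[
\begin{aligned}
\mathit{Init}_S((\gamma, \varepsilon)) &= (\gamma = \sigma_d) \wedge \forall X \in V_l : \varepsilon(X) = \beta_d(X) \\
\mathit{Trans}_S((\gamma, \varepsilon), x, (\gamma', \varepsilon'), y, o) &= \exists\, com \in \mathit{Com} : (\gamma', com) \in \delta(\gamma, x \cup o) \wedge (y, \varepsilon') = \mathcal{C}[\![com]\!](x, \varepsilon)
\end{aligned}
\]

Note that we omit parentheses for tuples whenever the context allows.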



4.5 Parallel Composition 

The tuple $(c_1, c_2)$ is the initial configuration of the chart $S =_{df} \mathit{And}(S_1, S_2)$
whenever $c_1$ and $c_2$ are the initial configurations of the charts $S_1$ and $S_2$, respectively:

\[ \mathit{Init}_S((c_1, c_2)) = \mathit{Init}_{S_1}(c_1) \wedge \mathit{Init}_{S_2}(c_2) \]

The formal semantics is defined by the following case distinction, which yields
three mutually exclusive cases. An And-chart can perform a step when at least
one of the subcharts makes a step.

\[
\begin{aligned}
\mathit{Trans}_S((c_1, c_2), x, (c_1', c_2'), y, o) ={}
& \big(\exists y_1, y_2.\ \mathit{Trans}_{S_1}(c_1, x, c_1', y_1, o) \wedge \mathit{Trans}_{S_2}(c_2, x, c_2', y_2, o) \wedge y = y_1 \cup y_2\big) \\
{}\vee{} & \big((\nexists y_2, c.\ \mathit{Trans}_{S_2}(c_2, x, c, y_2, o)) \wedge \mathit{Trans}_{S_1}(c_1, x, c_1', y, o) \wedge c_2' = c_2\big) \\
{}\vee{} & \big((\nexists y_1, c.\ \mathit{Trans}_{S_1}(c_1, x, c, y_1, o)) \wedge \mathit{Trans}_{S_2}(c_2, x, c_2', y, o) \wedge c_1' = c_1\big)
\end{aligned}
\]

The first disjunct represents the case when both charts $S_1$ and $S_2$ can react
in their current configurations $c_1$ and $c_2$ to the current signals $x$. In this case the
overall reaction is simply denoted by the logical conjunction of both transition
predicates $\mathit{Trans}_{S_1}$ and $\mathit{Trans}_{S_2}$. The other two disjuncts are true whenever
only one of $S_1$ and $S_2$ can react to the current stimuli in its current configuration.
Should none of the three terms be true, the overall transition predicate $\mathit{Trans}_S$
is false, i.e. $S$ cannot react at all.
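The case distinction can be sketched operationally if a transition predicate is represented as a function that returns the set of possible reactions. This representation is an assumption made for illustration: `trans(c, x, o)` yields a set of pairs (successor configuration, output signal set), and an empty set means that the chart cannot react. The helper `make_chart` builds a hypothetical two-state toy chart.

```python
def and_trans(trans1, trans2):
    """Transition function of And(S1, S2), built from those of S1 and S2."""
    def trans(c, x, o):
        c1, c2 = c
        r1 = trans1(c1, x, o)
        r2 = trans2(c2, x, o)
        if r1 and r2:      # both react: combine successors, join the outputs
            return {((n1, n2), y1 | y2) for n1, y1 in r1 for n2, y2 in r2}
        if r1:             # only S1 reacts: S2 keeps its configuration
            return {((n1, c2), y1) for n1, y1 in r1}
        if r2:             # only S2 reacts: S1 keeps its configuration
            return {((c1, n2), y2) for n2, y2 in r2}
        return set()       # neither reacts: the And-chart cannot react
    return trans


def make_chart(trigger, out):
    """Hypothetical toy chart: toggles between states 0 and 1 on `trigger`."""
    def trans(c, x, o):
        if trigger in x:
            return {(1 - c, frozenset({out}))}
        return set()
    return trans


both = and_trans(make_chart("a", "p"), make_chart("b", "q"))
print(both((0, 0), frozenset({"a", "b"}), frozenset()))
print(both((0, 0), frozenset({"a"}), frozenset()))
print(both((0, 0), frozenset(), frozenset()))
```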



4.6 Hierarchical Decomposition 

A decomposed chart $S =_{df} \mathit{Dec}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)\ \mathbf{by}\ g$ is in its initial
configuration iff the master $A =_{df} \mathit{Seq}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)$ and all $n$ existing slaves,
with $\mathrm{dom}\, g =_{df} \{\sigma_1, \ldots, \sigma_n\} \subseteq \Sigma$, are in their initial configurations:

\[ \mathit{Init}_S((c_m, c_1, \ldots, c_n)) = \mathit{Init}_A(c_m) \wedge \bigwedge_{i=1}^{n} \mathit{Init}_{S_i}(c_i) \]

where for all $i = 1, \ldots, n$ the mapping $g$ is defined by $g(\sigma_i) = S_i$. To define
the step relation for the decomposition, we distinguish four mutually exclusive
cases. The first case occurs whenever the current state $\sigma_m$ (with $c_m = (\sigma_m, \varepsilon_m)$) of
the master is refined by a slave $S_i$ (in this case $g(\sigma_m)$ is defined, i.e. $\sigma_m \in \mathrm{dom}\, g$
and $g(\sigma_m) = S_i$), and both $A$ and $S_i$ can react. All other currently non-active
slaves keep their current configuration ($\bigwedge_{j \neq i} c_j = c_j'$). Generated signals of both
master and active slave are collected: $y = y_s \cup y_m$. Note that whenever the
transition predicate $\mathit{Trans}_A$ of the master is true, the slave is initialized through
the predicate $\mathit{Init}_{S_i}(c_i')$. This first case is formally denoted by:

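\[
\begin{aligned}
\mathit{Trans}^1_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) ={}
& \exists y_m, y_s, c.\ \mathit{Trans}_A(c_m, x, c_m', y_m, o) \wedge {} \\
& \sigma_m \in \mathrm{dom}\, g \wedge S_i = g(\sigma_m) \wedge \mathit{Trans}_{S_i}(c_i, x, c, y_s, o) \wedge {} \\
& y = y_s \cup y_m \wedge \mathit{Init}_{S_i}(c_i') \wedge \bigwedge_{j \neq i} c_j = c_j'
\end{aligned}
\]

Here both master and slave can react to the current set of input stimuli. In this
case, the master preempts the slave’s reaction. Our semantics deals with weak
preemption, so the slave can still terminate its current action, i.e. generate all
output signals $y_s$. However, even then it will be re-initialized.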

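Whenever the master’s current configuration $c_m = (\sigma_m, \varepsilon_m)$ is not decomposed
($\sigma_m \notin \mathrm{dom}\, g$), all slaves stay in their current configurations ($\bigwedge_{i=1}^{n} c_i = c_i'$) and
only the master itself reacts:

\[
\begin{aligned}
\mathit{Trans}^2_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) ={}
& \mathit{Trans}_A(c_m, x, c_m', y, o) \wedge \sigma_m \notin \mathrm{dom}\, g \wedge \bigwedge_{i=1}^{n} c_i = c_i'
\end{aligned}
\]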

If, however, a slave exists but is not able to make a step, again only the master
reacts, but now the current slave $S_i$ is initialized and all other slaves do not
change their configuration:

\[
\begin{aligned}
\mathit{Trans}^3_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) ={}
& \mathit{Trans}_A(c_m, x, c_m', y, o) \wedge \sigma_m \in \mathrm{dom}\, g \wedge S_i = g(\sigma_m) \wedge {} \\
& (\nexists y_s, c_s'.\ \mathit{Trans}_{S_i}(c_i, x, c_s', y_s, o)) \wedge \mathit{Init}_{S_i}(c_i') \wedge \bigwedge_{j \neq i} c_j = c_j'
\end{aligned}
\]

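Finally, if the master cannot react, but the current slave $S_i$ can, we have:

\[
\begin{aligned}
\mathit{Trans}^4_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) ={}
& (\nexists y_m, c_m'.\ \mathit{Trans}_A(c_m, x, c_m', y_m, o)) \wedge {} \\
& \sigma_m \in \mathrm{dom}\, g \wedge S_i = g(\sigma_m) \wedge \bigwedge_{j \neq i} c_j = c_j' \wedge \mathit{Trans}_{S_i}(c_i, x, c_i', y, o)
\end{aligned}
\]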

Overall, the complete transition relation is the disjunction of these cases:

\[
\begin{aligned}
\mathit{Trans}_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) ={}
& \mathit{Trans}^1_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) \\
{}\vee{} & \mathit{Trans}^2_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) \\
{}\vee{} & \mathit{Trans}^3_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o) \\
{}\vee{} & \mathit{Trans}^4_S((c_m, c_1, \ldots, c_n), x, (c_m', c_1', \ldots, c_n'), y, o)
\end{aligned}
\]

The predicate $\mathit{Trans}_S$ is false iff neither master nor slave can react to the current
input.



4.7 Broadcast Communication 

The initialization predicate for $S = \mathit{Feedback}(R, L)$ is defined as:

\[ \mathit{Init}_S(c) = \mathit{Init}_R(c) \]

The transition relation $\mathit{Trans}_S$ is built up from a number of auxiliary predicates.
As we deal with a chain reaction when defining the semantics of the instantaneous
feedback, we have to fix the termination of this reaction. It terminates when, in
the current configuration $c$, the argument chart $R$ can no longer react to the current
input stimuli $x$:

\[ \mathit{Term}_S(c, x, o) = \nexists y, c'.\ \mathit{Trans}_R(c, x, c', y, o) \]

The predicate $\mathit{Cone}_S$ constructs the set of all intermediate points in the chain
reaction by the μ-calculus formula:

\[
\begin{aligned}
\mathit{Cone}_S(c, x, c', y, o) ={}
& \mu\Phi.\ \big(\mathit{Trans}_R(c, x, c', y, o) \vee {} \\
& \exists x', y', y'', c''.\ \Phi(c, x, c'', y'', o) \wedge \mathit{Trans}_R(c'', x', c', y', o) \wedge {} \\
& x' = x \cup (y'' \cap L) \wedge y = y' \cup y''\big)
\end{aligned}
\]

In order to verify whether $\mathit{Cone}_S(c, x, c', y, o)$ holds, we have to check
whether either of the following two cases is true. The first alternative is that $c$
and $c'$ are two subsequent configurations, i.e. $c'$ is reachable from $c$ in one
step: $\mathit{Trans}_R(c, x, c', y, o)$. Otherwise, it has to be verified whether $c'$ can
be reached from $c$ via an intermediate configuration $c''$. All configurations
reachable from $c$ are computed by applying the least fixpoint operator $\mu$ to the predicate $\Phi$.
Note that the external stimuli $x$ are available during the whole chain reaction and
that only those internal signals which occur in $L$ can be fed back: $x' = x \cup (y'' \cap L)$.
The overall transition relation of $S$ is then defined as:



\[
\mathit{Trans}_S(c, x, c', y, o) = \mathit{Cone}_S(c, x, c', y, o) \wedge \mathit{Term}_S(c', x \cup (y \cap L), o) \wedge \mathit{Consistence}(x, y, o)
\]



As already mentioned, the oracle signals in o nondeterministically predict the 
absence or presence of signals in a step. This prediction is needed for the proper 
treatment of negative trigger expressions in sequential automata. Of course, 
such a guess might lead to inconsistencies, if in fact a signal predicted to be 
present is neither input from the environment, nor generated by the system, or 
vice versa. Such inconsistencies are detected with the predicate Consistence 
defined in Section 4.2. They can only occur with instantaneous feedback of a 
signal that can be generated in one subchart, and whose absence is checked in 
another subchart. 

The transition relations defined above are partial: when a chart cannot react
to its current input, the relation is undefined. Intuitively, however, the chart
should stay in its current configuration in this case. The execution of a chart $S$ is
therefore defined over the following total step relation:

\[
\begin{aligned}
\mathit{Step}_S(c, x, c', y) ={}
& (\exists o.\ \mathit{Trans}_S(c, x, c', y, o)) \vee {} \\
& \big((\forall c'', y'', o.\ \neg \mathit{Trans}_S(c, x, c'', y'', o)) \wedge c = c' \wedge y = \emptyset\big)
\end{aligned}
\]
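For a finite chart, the cone, termination, and step predicates can be computed by straightforward fixpoint iteration. The sketch below is an illustration under assumptions, not the BDD-based procedure used later: a transition predicate is a function `trans_r(c, x, o)` returning a set of pairs (successor configuration, output set), the oracle set `o` is passed through unchanged (so the consistency check and the existential quantification over oracles are left out), and `trans_r` is a hand-written toy chart with two automata guarded by hypothetical copyright signals `CR1` and `CR2`.

```python
def cone(trans_r, L, c, x, o):
    """Least fixpoint: all (c', y) reachable from c by a chain of micro-steps."""
    reached = set(trans_r(c, x, o))                 # base case: a single transition
    while True:
        new = set()
        for c2, y2 in reached:
            x1 = x | (y2 & L)                       # feed back only signals in L
            for c1, y1 in trans_r(c2, x1, o):
                new.add((c1, y1 | y2))              # accumulate the outputs
        if new <= reached:                          # nothing new: fixpoint reached
            return reached
        reached |= new


def feedback_trans(trans_r, L, c, x, o):
    """Cone results whose final configuration cannot react any further."""
    return {(c1, y) for c1, y in cone(trans_r, L, c, x, o)
            if not trans_r(c1, x | (y & L), o)}


def step(trans_r, L, c, x, o=frozenset()):
    """Total step relation: stay put with empty output if no reaction exists."""
    result = feedback_trans(trans_r, L, c, x, o)
    return result if result else {(c, frozenset())}


def trans_r(c, x, o):
    """Toy chart: automaton 1 turns 'a' into 'b', automaton 2 turns 'b' into 'c'."""
    s1, s2 = c
    fire1 = s1 == 0 and "a" in x and "CR1" not in x
    fire2 = s2 == 0 and "b" in x and "CR2" not in x
    if not (fire1 or fire2):
        return set()
    y = set()
    if fire1:
        s1, y = 1, y | {"b", "CR1"}
    if fire2:
        s2, y = 1, y | {"c", "CR2"}
    return {((s1, s2), frozenset(y))}


L = frozenset({"b", "CR1", "CR2"})
print(step(trans_r, L, (0, 0), frozenset({"a"})))
print(step(trans_r, L, (0, 0), frozenset()))
```

In the first call the input `a` triggers automaton 1, whose output `b` is fed back and triggers automaton 2 in the same instant; the intermediate configuration `(1, 0)` is in the cone but is discarded because it is not terminal.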



For each chart, there is only a finite number of control states and a finite set 
of input and output signals. Thus, this transition relation can be efficiently 
represented by BDDs [6]. BDDs have a long tradition in the fields of logic 
minimization, test pattern generation, and symbolic model checking as we show 
in the next section. 

A finite characterization of the initial state of a μ-Chart, $\mathit{Init}_S(c)$, can be derived
similarly. Thus, the semantics of every component of a system specification is
represented by two BDDs: one characterizes the initial states of the component,
and the other the step relation. Since the semantics of μ-Charts is compositional,
the BDDs for each chart can be computed from the BDDs of its subcharts.



5 Formal Verification

In the last section, we defined the semantics of μ-Charts in terms of the
relational μ-calculus. Recently, several verification tools for this calculus have been
developed [4, 7]. For our work, we used the μ-calculus model checker μ-cke. In
this section, we show how a specification in μ-Charts can be encoded in μ-cke,
and how the encoding can be used for property verification.



5.1 The Verifier μ-cke

The input language of μ-cke is similar to the programming language C. Currently,
the supported data types include Booleans, enumerations, bounded intervals of
the natural numbers, and algebraic products that are defined with a construct
“class” that is similar to “struct” in C.

Relations are defined as Boolean-valued functions. In the μ-calculus, the least
and greatest fixpoint operators are used to define relations recursively. As in C,
μ-cke permits recursion by writing the relation’s name on the right-hand side of
the relation’s definition. In addition, one of the keywords mu or nu has to precede the
definition to indicate whether the least or greatest fixpoint is intended.

Currently, μ-cke does not support non-Boolean functions or types corresponding
to the algebraic sum (union in C).

Internally, μ-cke uses BDD algorithms to evaluate formulas. The variable
ordering for the BDDs is computed automatically, but it is possible to give “hints”
in the input files about the ordering and interleaving of variables. There exist
commands to arrange variables in sequential (x < y) or interleaved (x ~+ y)
ordering.



5.2 Encoding 

We implemented a small translator from μ-Charts to the input language of μ-cke.
The translator is written in Perl [29], and the definition of a μ-Chart is just a
file with Perl declarations. The translation is performed top-down following the
hierarchical structure of the μ-Chart. For each construct, a configuration data
type, an initialization predicate, and a transition relation are defined.

In the following, we present some parts of the translation of the example in
Section 2.2.



Signals. For both the external and internal signals that occur in the specification,
a single type is defined. While two separate types might seem more natural,
the single type helps with the variable ordering used by μ-cke and makes the
internal representation more efficient. Note that for each sequential automaton a
copyright signal is introduced. Whenever a signal $s$ is present, the corresponding
Boolean variable $v_s$ is true; otherwise it is false.



class Signals {
  /* Internal and external signals: */
  bool crash;
  bool o;
  bool ignition;
  bool c;
  bool rmr;
  bool ready;
  bool lup;
  bool lmr;
  bool ldn;
  bool rdn;
  bool rup;

  /* Copyright signals: */
  bool CR_Control;
  bool CR_Normal;
  bool CR_Lockg;
  bool CR_Unlg;
  bool CR_MotorLeft;
  bool CR_MotorRight;
  bool CR_DownLeft;
  bool CR_DownRight;
  bool CR_UpLeft;
  bool CR_UpRight;
};



Oracles. The locking system contains two internal signals, lmr and rmr (see the
automata for locking and unlocking), that are referenced negatively. Thus, we
need two oracle variables. The translator introduces a new type for these oracles
and defines a predicate for consistence:

class Oracles {
  bool lmr;
  bool rmr;
};

bool Consistence (Signals i, Signals o, Oracles orc)
(
  ((i.lmr | o.lmr) -> orc.lmr) &
  ((i.rmr | o.rmr) -> orc.rmr) &
  (orc.lmr -> (i.lmr | o.lmr)) &
  (orc.rmr -> (i.rmr | o.rmr))
);



Sequential automata. For each sequential automaton, two state types are in- 
troduced. One, the vertex, is an enumeration type that represents the control 
state. The second type is a product type for the chart’s complete configuration. 
In addition to the control state, it contains the data state components. Since 
the locking system is specified without data states, this type consists of just the 
vertex. 



In addition, two predicates are defined:

— The first predicate characterizes the initial configuration of the automaton.
Initial configurations restrict the control state to the default state of the
automaton, and could also restrict the data components, if used.

— The second predicate defines the transition relation over the current
configuration s, the successor configuration t, the input and output signal sets i and o,
and the oracle set orc. The relation is a disjunction of the individual
transition conditions, together with a condition that restricts the automaton’s
outputs.



Note the three expressions after the transition relation’s header: these are
variable ordering hints for μ-cke. They indicate that the variables for the current and
successor configurations, as well as the input and output signal variables, should
be interleaved. Moreover, the input (and thus also the output) variables come
strictly before the configurations. Nothing is said about the oracle variables;
thus, the checker is free to use its heuristics for their positioning.



enum MotorleftVertex {
  Down_Motorleft, Off_Motorleft, Up_Motorleft
};

class MotorleftState {
  MotorleftVertex c;
};

bool MotorleftInit (MotorleftState s)
  (s.c = Off_Motorleft);

bool MotorleftTrans (MotorleftState s, Signals i, MotorleftState t,
                     Signals o, Oracles orc) s ~+ t, i ~+ o, i < s
(
  ((s.c = Off_Motorleft & t.c = Down_Motorleft &
    i.ldn & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Off_Motorleft & t.c = Up_Motorleft &
    i.lup & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Off_Motorleft & t.c = Down_Motorleft &
    !i.lup & !i.ldn & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Off_Motorleft & t.c = Off_Motorleft &
    i.CR_Motorleft & !o.CR_Motorleft) |
   (s.c = Down_Motorleft & t.c = Off_Motorleft &
    i.ready & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Down_Motorleft & t.c = Down_Motorleft &
    !i.ready & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Down_Motorleft & t.c = Down_Motorleft &
    i.CR_Motorleft & !o.CR_Motorleft) |
   (s.c = Up_Motorleft & t.c = Off_Motorleft &
    i.ready & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Up_Motorleft & t.c = Up_Motorleft &
    !i.ready & !i.CR_Motorleft & o.CR_Motorleft) |
   (s.c = Up_Motorleft & t.c = Up_Motorleft &
    i.CR_Motorleft & !o.CR_Motorleft))

  /* Besides CR_Motorleft no other output is possible: */
  & !o.crash & !o.o & !o.c & !o.ignition & !o.ready & !o.lmr & !o.rmr
  & !o.ldn & !o.rdn & !o.lup & !o.rup & !o.CR_Control & !o.CR_Normal
  & !o.CR_Lockg & !o.CR_Unlg & !o.CR_MotorRight & !o.CR_DownLeft
  & !o.CR_DownRight & !o.CR_UpLeft & !o.CR_UpRight
);



Parallel composition and hierarchical decomposition. For brevity we skip the
remaining two constructs of μ-Charts. Given the semantics of Section 4, the
translation is straightforward, but the resulting code is quite lengthy.



Broadcast communication. The configuration data type and the initialization
predicate for the feedback operator are identical to those of its argument
μ-Chart.



class TopState {
  ParallelState s;
};

bool TopInit (TopState s)
  ParallelInit (s.s);



The transition relation for the feedback operator is defined in terms of an
auxiliary relation cone that defines the chain reactions. This relation is defined as
a least fixpoint construction; a chain reaction is either a single transition of the
argument μ-Chart, or a transition from a state already reached in the chain
reaction. Here the input and output signals have been copied; this is the logical
notation of the set-based definition in Section 4.



mu bool TopCone (TopState s, Signals i, TopState t,
                 Signals o, Oracles orc) s ~+ t, i ~+ o, i < s
  /* Base: */
  ParallelTrans (s.s, i, t.s, o, orc) |
  /* Microstep: */
  (exists Signals oo, Signals ooo, Signals ii, TopState tt. (
    TopCone (s, i, tt, oo, orc) &
    ParallelTrans (tt.s, ii, t.s, ooo, orc) &
    (ii.CR_Control <-> (i.CR_Control | oo.CR_Control)) &
    (ii.crash <-> (i.crash)) &
    ...
    (o.rdn <-> (oo.rdn | ooo.rdn)) &
    (o.lup <-> (oo.lup | ooo.lup)) &
    (o.rup <-> (oo.rup | ooo.rup))));



The transition relation checks in addition that the configuration $c'$ reached in
the cone predicate is terminal, i.e. $\mathit{Term}_S(c', x \cup (y \cap L), o)$, and that the signal
sets built up while reaching it are consistent.



bool TopTrans (TopState s, Signals i, TopState t,
               Signals o, Oracles orc) s ~+ t, i ~+ o, i < s
  /* Reachable, */
  TopCone (s, i, t, o, orc) &
  /* yet terminal configuration: */
  !(exists Signals ii, Signals oo, TopState tt.
    (ParallelTrans (t.s, ii, tt.s, oo, orc) &
     (ii.CR_Control <-> (i.CR_Control | o.CR_Control)) &
     (ii.crash <-> (i.crash)) &
     ...
     (ii.rup <-> (i.rup | o.rup)))) &
  Consistence (i, o, orc);

Again, the predicate definitions are annotated with hints for the variable order- 
ing. 



System reactions. We have seen how the step semantics of a µ-Chart can be 
encoded in µ-cke. For verification, however, we need the behavior at the level of 
system reactions. From the transition semantics we obtain the reaction semantics 
by existentially quantifying the oracles. Thus, we implicitly handle all guesses. 
If there is no oracle such that there is a valid transition, the µ-Chart's successor 
configuration is defined to be equal to the current configuration, and all output 
signals are absent. 

bool LockingSystemReact (LockingSystemState s, Signals i, 
                         Signals o, LockingSystemState t) s ~+ t 
  (exists Oracles orc. 
    !i.lmr & !i.rmr & !i.ldn & !i.rdn & !i.lup & !i.rup & 
    !i.CR_Control & !i.CR_Normal & !i.CR_Doorstate & 
    !i.CR_Motorleft & !i.CR_Motorright & 
    LockingSystemTrans (s, i, orc, o, t)) | 
  (s = t & !o.crash & ... & !o.CR_Motorright); 
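
The step from transition semantics to reaction semantics can be illustrated outside µ-cke. The following Python sketch is not the authors' encoding: it enumerates oracle values explicitly instead of using BDDs, it does not model the additional constraint that internal signals are absent in the input, and the names `react` and `toy_trans` as well as the toy transition relation are assumptions made for illustration only.

```python
def react(trans, oracles, state, inputs, all_absent):
    """Return the set of (output, next_state) reactions for one step.

    trans(state, inputs, oracle) yields (output, next_state) pairs.
    The oracle is existentially quantified by taking the union over all
    oracle values. If no oracle admits a transition, the system keeps
    its configuration and all output signals are absent.
    """
    results = set()
    for orc in oracles:
        results.update(trans(state, inputs, orc))
    if not results:
        results.add((all_absent, state))
    return results


# Hypothetical toy transition relation: only oracle value 1 enables
# the transition from "idle" to "busy" when the signal "go" is present.
def toy_trans(state, inputs, orc):
    if state == "idle" and "go" in inputs and orc == 1:
        yield (frozenset({"done"}), "busy")


enabled = react(toy_trans, [0, 1], "idle", frozenset({"go"}), frozenset())
stutter = react(toy_trans, [0, 1], "busy", frozenset({"go"}), frozenset())
```

In the toy example the first call finds a valid oracle, while the second falls back to the stuttering reaction with no output signals.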



5.3 Model Checking 

For property verification, we also formulate the property to be proven in the 
relational µ-calculus. Simple properties such as invariants can be straightforwardly 
expressed in the µ-calculus; others are usually more easily expressed in the more readable 
temporal logics CTL, CTL* or LTL, and then schematically translated into 
the µ-calculus. A translation of the branching-time logic CTL into the µ-calculus is 
described in [20]; translations of CTL* and also LTL can be found in [8]. 

Here we just look at the simple verification whether a given configuration is 
reachable. Reachability can be defined as a least fixpoint with the following 
formula: 

mu bool reachable (LockingSystemState s) 
  LockingSystemInit (s) | 
  (exists Signals i, Signals o, LockingSystemState r. 
    reachable (r) & 
    LockingSystemReact (r, i, o, s)); 



A similar formula with a largest fixpoint operator nu could be used to check for 
invariance of a property. 
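
For readers unfamiliar with fixpoint notation, the reachability formula corresponds to the usual iteration that starts from the initial configurations and adds successors until nothing new appears. The Python sketch below works on explicit sets rather than BDDs; the function name `reachable_states` and the toy successor table are assumptions for illustration, and inputs and outputs are taken to be already quantified away inside `successors`.

```python
def reachable_states(init_states, successors):
    """Least fixpoint of: reached = init ∪ successors(reached)."""
    reached = set(init_states)
    frontier = set(init_states)
    while frontier:
        new = set()
        for r in frontier:
            for s in successors(r):
                if s not in reached:
                    new.add(s)
        reached |= new
        frontier = new  # only newly found states need to be expanded
    return reached


# Hypothetical successor table of a small system.
table = {
    "locked": {"unlocked"},
    "unlocked": {"locked", "crashed"},
    "crashed": set(),
    "unused": {"locked"},
}
result = reachable_states({"locked"}, lambda r: table[r])
```

The state `"unused"` has a successor but is never reached from the initial state, so it is not part of the result.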

The µ-cke encoding can also be used to check a specification for determinism. 
In µ-cke, this condition is formulated as follows: 

forall Signals i, Signals o1, Signals o2, 
       LockingSystemState s, t1, t2. 
  (reachable (s) & 
   LockingSystemReact (s, i, o1, t1) & 
   LockingSystemReact (s, i, o2, t2)) 
  -> (o1 = o2 & t1 = t2); 

Note that here we first calculate the reachable configurations of the specification; 
thus, our condition is stronger and more precise than the syntactic determinism 
checks of, say, the Argos compiler. On the other hand, the calculation of reach- 
able states can be computationally expensive for larger specifications, especially 
when local data states are used. 
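
The determinism condition can be restated operationally: for every reachable configuration and every input there is at most one pair of output and successor. The following Python sketch checks this on an explicitly enumerated reaction relation; it is an illustration under the assumption that the reachable configurations and the inputs are small finite sets, and the names `is_deterministic` and `toy_react` are hypothetical.

```python
def is_deterministic(reachable, inputs, react):
    """True iff every (state, input) has at most one (output, successor)."""
    for s in reachable:
        for i in inputs:
            if len(set(react(s, i))) > 1:
                return False
    return True


# Hypothetical reaction relation: state "b" reacts to input 1 in two ways.
toy_react = {
    ("a", 0): {("-", "a")},
    ("a", 1): {("x", "b")},
    ("b", 0): {("-", "b")},
    ("b", 1): {("y", "a"), ("z", "b")},
}

def react(s, i):
    return toy_react[(s, i)]

whole_system = is_deterministic({"a", "b"}, {0, 1}, react)
only_state_a = is_deterministic({"a"}, {0, 1}, react)
```

The second call shows why restricting the check to reachable configurations matters: a nondeterministic reaction in a configuration that is never reached does not violate the condition.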







6 Hardware Generation 



In this section we present a hardware implementation scheme for µ-Charts. In 
contrast to the tool Statemate [15], we aim at a direct implementation, and not 
at compilation to VHDL code. Two previous approaches for direct hardware 
implementations of a Statecharts-like language were presented by Drusinsky in 
[9, 10]. The first one implemented a Statechart as a network of communicating 
finite state machines. Especially for small and medium-sized systems this scheme 
introduces considerable communication overhead; in addition, the author admits 
that it is difficult to implement correctly. 

In the second approach, he suggested realizing a Statechart as a single logic 
block. However, it is not clear how this approach scales to larger specifications. 
Neither of these two approaches is based on a formal semantics. Of course, in 
principle, a very low-level semantics is given in form of hardware logic. However, 
it is not possible to formally verify properties like compositionality of this type 
of semantics. Properties of this kind are, for instance, important if one wants 
to support the designer with formal refinement rules that help her or him in 
incrementally developing a specification [26]. Moreover, neither approach allows 
for specifying systems with data states. 

Our implementation scheme is based on the formal semantics introduced in the 
previous sections; in particular, implementations are generated from the same 
transition relations that are used for formal verification of µ-Charts with 
symbolic model checkers. While our scheme also avoids the communication overhead 
of communicating components, the implementations can naturally be divided 
into smaller logic blocks, each of them responsible for the value of a single 
output signal bit. 

As synchronous hardware allows only deterministic implementations, we restrict 
ourselves to deterministic µ-Charts in this section. With the techniques of 
Section 5 it is easy to check that a µ-Chart is deterministic. 



6.1 Symbolic Encoding of µ-Charts 



In Section 4 we presented the formal step semantics of µ-Charts. It has been 
shown that the step semantics of a given µ-Chart S can be described as a finite 
transition relation 

$\mathit{Step}_S(x, c, y, c')$ 

where x and y are finite encodings of the input and output signals, respectively. The 
current and next system configuration are encoded in c and c', respectively. A 
similar predicate $\mathit{Init}_S(c)$ for initialization has been formulated. 

These predicates are used for the hardware implementation according to 
Figure 8. The input to the logic block consists of the external inputs x, the finite 
encoding of the current system state (including control c and data state e), and 
a reset wire. The logic block has system output y and the new system state 
encoding c' as output. In addition to this combinational logic block, we need a 
state register for the control and data states and an external clock that triggers 
the µ-Chart steps. 



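[Block diagram: the inputs reset and x and the output of the state register feed the logic block; the logic block produces the output y and the next state, which is stored in the state register clocked by clk.] 

Figure 8. Hardware implementation scheme 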



The logic block is derived in the following way. The transition relation $\mathit{Step}_S$ is 
converted to a family of Boolean functions, one for each output signal $y_i$, and 
one for each bit in the encodings of the control and data states. The Boolean 
function for a bit is derived from the transition relation $\mathit{Step}_S$ by existentially 
quantifying the other output signals. This operation can, for instance, be 
efficiently carried out with BDDs. The conversion to Boolean functions is possible, 
since for hardware generation we restrict ourselves to deterministic µ-Charts. 
The check for whether a µ-Chart is deterministic can also be performed on the 
transition relation $\mathit{Step}_S$. 

As shown in Figure 9, the individual Boolean functions can be implemented 
separately. In this figure, thick lines represent bundles of signals, whereas single 
bit connections are drawn as thin lines. 

Each Boolean function for an output signal contains an abstraction of the 
complete specification, which is why our implementation scheme does 
not require explicit communication between the logic blocks. The abstraction 
contains exactly those aspects of the complete system specification that are needed to 
calculate the signal's value. For this reason 
the individual Boolean functions can be represented with comparatively small 
BDDs, much smaller than the complete transition relation $\mathit{Step}_S$. 

Figure 9. Distributed implementation 
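
The derivation of one Boolean function per bit can be illustrated on an explicitly enumerated relation. The Python sketch below is not a BDD implementation; it represents a deterministic step relation as a set of tuples (x, c, y, c') of bit tuples and computes, for a selected bit, the set of inputs (x, c) for which that bit is 1. The toggle example and the names `on_set` and `step` are assumptions for illustration.

```python
from itertools import product


def on_set(step, pick_bit):
    """Inputs (x, c) for which the selected output or next-state bit is 1.

    All other outputs are existentially quantified: we only ask whether
    some tuple of the relation with this (x, c) sets the selected bit.
    """
    return {(x, c) for (x, c, y, c_next) in step if pick_bit(y, c_next)}


# Hypothetical toy relation: a one-bit toggle with enable input x.
# The output y shows the current state, the next state is c XOR x.
step = {
    ((x,), (c,), (c,), (c ^ x,))
    for x, c in product((0, 1), repeat=2)
}

next_state_bit = on_set(step, lambda y, c_next: c_next[0] == 1)
output_bit = on_set(step, lambda y, c_next: y[0] == 1)
```

Because the relation is deterministic, each on-set defines a Boolean function of (x, c); the two functions can be implemented as separate logic blocks that share only their inputs.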



6.2 Stopwatch Example 

In this section, we apply our implementation technique to the stopwatch ex- 
ample. Since only a finite subset of integers is used for the data states in the 
example, we can encode them with BDDs. For the clock counter in the subchart 
Timer, we employ the standard binary encoding for integers between 0 and 
100,000; this requires 17 bits. We could represent the ten values for the digit 
displays using 4-bit integers, but instead we have chosen a direct encoding of 
the seven segments for each digit. This allows us to use a slight variation of the 
implementation scheme: instead of dedicated output signals for the segments, 
we use the state encoding itself to drive the display. 

Since the watch consists of five sequential automata, two parallel compositions, 
and two feedback constructions, there are nine initialization and nine transition 
relations. The top-level relations, that is, those of Dec $S_{stopwatch}$ by $\varrho_{stopwatch}$, 
are the ones used for hardware generation. The transition relation is then 
converted to a family of Boolean functions. 

Figure 10. BDD for segment 0 of display Lo 



Figure 10 shows the BDD for segment 0 of the lowest digit. With 66 nodes, the 
BDD is quite small. It is not possible, in general, to predict the BDD sizes for a 
given µ-Chart. We can, however, calculate the number of bits needed to encode 
the data and control states, and thus the width of the state register: 

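\[
\begin{aligned}
\mathit{bits}(\mathit{Seq}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)) &= \lceil \mathrm{ld}(|\Sigma|) \rceil + \Big\lceil \mathrm{ld}\Big(\prod_{X \in V_l} \mathit{range}(X)\Big) \Big\rceil \\
\mathit{bits}(\mathit{And}(S_1, S_2)) &= \mathit{bits}(S_1) + \mathit{bits}(S_2) \\
\mathit{bits}(\mathit{Feedback}(S, L)) &= \mathit{bits}(S) \\
\mathit{bits}(\mathit{Dec}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta) \text{ by } \varrho) &= \mathit{bits}(\mathit{Seq}(N, V_l, \beta_d, \Sigma, \sigma_d, \sigma, \delta)) \\
&\quad + \max\{\mathit{bits}(\varrho(s)) \mid s \in \Sigma \wedge \varrho(s) \neq \mathit{NoDec}\}
\end{aligned}
\]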

In the stopwatch example we have $\mathrm{ld}(2) = 1$ flipflop for the automaton $S_{stopwatch}$, 
$\lceil \mathrm{ld}(4) \rceil + \lceil \mathrm{ld}(10^5) \rceil = 2 + 17 = 19$ for $S_{Timer}$, and $3 \cdot (7 + 1) = 24$ for $S_{Digits}$. As 
mentioned, we directly use the encoding of the local states to control the seven 
segment displays and so avoid additional logic to decode the logic for each 
segment out of a four bit state. Altogether we need a 44 bit wide register. 
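
The arithmetic behind the register width can be retraced with a few lines of Python. This is only a sketch: `ceil_ld` and `bits_seq` are hypothetical helper names, each automaton is given a single data range, and the reading of the term 3 · (7 + 1) as three digit automata with two control states and seven segment bits each is an assumption, not something stated explicitly in the text.

```python
def ceil_ld(n):
    """Smallest k with 2**k >= n, for n >= 1 (rounded-up binary logarithm)."""
    return (n - 1).bit_length()


def bits_seq(num_states, data_range=1):
    """Bits for one sequential automaton: control state plus data state."""
    return ceil_ld(num_states) + ceil_ld(data_range)


stopwatch = bits_seq(2)            # two control states, no data
timer = bits_seq(4, 10**5)         # four control states, counter 0..100,000
digits = 3 * bits_seq(2, 2**7)     # assumed: 2 states and 7 segment bits per digit
register_width = stopwatch + timer + digits
```

Parallel composition adds the widths of its components and feedback adds nothing, so the sum of the three contributions gives the register width.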

Referring to Figure 9 again, the register width is the constant m, in our example 
44. The constant k is the number of external input signals. In the stopwatch 
example, k = 2 since the only inputs are for the two buttons on the watch. 
Finally, n is the number of output signals. In the example, n should be 3 · 7 = 21 
for the three seven segment displays. However, since we directly use the state 
encoding as outputs for the display, our example gives n = 0, since there are 
no other outputs of the system. Thus, for the example 44 BDDs have to be 
implemented. Note that low, med, and time are internal signals for broadcasting 
only. Their use is implicitly modeled in the Boolean functions. 



7 Conclusion 

The Statechart dialect presented in this paper offers instantaneous feedback and 
nondeterminism. We have shown how to deal with both concepts formally and 
demonstrated that model checking for specifications with instantaneous chain 
reactions is possible. We demonstrated our approach by an example. The re- 
sults presented in Section 5 show that it is difficult to place trust in a formal 
specification without proving central system properties. 

We expect that in our framework larger specifications can be verified than in 
approaches without instantaneous feedback. The reason is that in the feedback 
definition intermediate configurations that occur only during chain reactions are 
hidden through the fixpoint construction. With other communication mecha- 
nisms, these intermediate configurations remain visible. Moreover, we believe 
that specifications with instantaneous broadcasting are more concise than those 
written in e.g. the Statemate dialect. 

Finally, it remains to be seen whether BDD-based symbolic verification 
techniques are indeed the best approach for model checking µ-Charts. For instance, 
in our example only 22 configurations are reachable. It is possible that 
non-symbolic techniques are more efficient for µ-Chart specifications. However, the 
high-level input language of µ-cke turned out to be very helpful for rapid 
prototyping of our language definition, semantics, and verification approach. 

Furthermore, we have presented a hardware implementation scheme for µ-Charts. 
Our hardware implementation scheme is directly based on the formal semantics 
of µ-Charts, the same semantics as the one used for 
model checking. When local data states are restricted to finite types, the step 
relations can be translated immediately into BDDs. These BDDs can be 
implemented as combinational logic. As BDDs are based on the Shannon normal 
form (SNF) and some programmable hardware units like field programmable 
gate arrays are based on the disjunctive normal form (DNF), it may be useful to 
transform BDDs to DNF before implementing them in hardware. 

In addition to the logic, only a register and an external clock are needed. The 
register size can be calculated a priori. 

Future work will focus on the further development of a formal refinement calculus 
[26] that supports correctness by construction, on software generation, and on 
hardware/software codesign. 

Abstract. This paper is an attempt to demonstrate the potential of conditional 
refinement in step-wise system development. In particular, we 
emphasise the ease with which conditional refinement allows bounded- 
ness constraints to be introduced in a specification based on unbounded 
resources. For example, a specification based on purely asynchronous 
communication can be conditionally refined into a specification using 
time-synchronous communication. 

The presentation is built around a small case-study: A step-wise design 
of a timed FIFO queue that is partly to be implemented in hardware 
and partly to be implemented in software. We first specify the exter- 
nal behaviour of the queue ignoring timing and synchronisation. This 
overall specification is then restated in a time-synchronous setting and 
thereafter refined into a composite specification consisting of three sub- 
specifications: A specification of a time-synchronous hardware queue, a 
specification of an asynchronous software queue, and a specification of 
an interface component managing the communication between the first 
two. We argue that the three overall specifications can be related by 
conditional refinement. By further steps of conditional refinement ad- 
ditional boundedness constraints are introduced. We explain how each 
step of conditional refinement can be formally verified in a compositional 
manner. 



1 Introduction 

Requirements imposing upper bounds on the memory available for some data 
structure, channel, or component are often programming language or platform 
dependent. Such requirements may for instance characterise the maximum num- 
ber of messages that can be sent along a channel within a time unit without 
risking malfunction because of buffer overflow. Clearly, this number may vary 
from one buffer to another depending on the type of messages it stores, and the 
way the buffer is implemented. 

One way to treat such boundedness constraints is to introduce them already 
at the most abstract level — in other words, in the requirement specification. 
However, this option is not very satisfactory, for several reasons. The bounded- 
ness constraints: 

- considerably complicate the specifications; as a result, the initial 
understanding of the system to be designed is often reduced; 

- reduce the set of possible implementations; this makes the reuse of 
specifications more difficult; 

- complicate formal reasoning; this means it becomes more difficult to verify 
the correctness of refinement steps; 

- are often not known when the requirement specifications are written; for 
instance, at this stage in a system development, it is often not decided which 
implementation language(s) to use and on what sort of platform the system 
is to run. 

We conclude that any development method requiring that boundedness con- 
straints are introduced already in the requirement specification, is not very use- 
ful from a practical point of view. However, since any computerised system is 
based on bounded resources, we need concepts of refinement supporting the 
transition from specifications based on unbounded resources to specifications 
based on bounded resources. This paper advocates the flexibility of conditional 
refinement for this purpose. We define conditional refinement for a specification 
language based on streams. A specification in this language is basically a relation 
between input and output streams. Each stream represents a complete commu- 
nication history for a channel. To formalise timing properties we use what we 
call timed streams. Timed streams capture a discrete notion of time. The under- 
lying communication paradigm is asynchronous message passing. The ideas of 
this paper can be adapted to other specification languages, as well. For example, 
[14] formulates a notion of conditional refinement in SDL (see also [7]). 

The remainder of the paper is organised as follows: In Sect. 2 we introduce 
the basic concepts and notations; in Sect. 3 we give three different specifications 
of a FIFO queue; in Sect. 4 we formally define conditional refinement and explain 
how the specifications from Sect. 3 can be related by this concept; in Sect. 5 we 
conditionally refine the FIFO queue specified in Sect. 3 into a more concrete 
specification with additional boundedness constraints; in Sect. 6 we give a brief 
summary and relate our approach to other approaches known from the literature; 
in App. A we formulate three rules that can be used to verify the refinement 
steps mentioned above; in App. B we prove that these three rules are sound. 



2 Basic Concepts and Notations 

Streams model the communication histories of channels. We use both timed and 
untimed streams. A timed stream is a finite or infinite sequence of messages 
and time ticks. A time tick is denoted by $\surd$. The time interval between two 
consecutive ticks represents the least unit of time. A tick occurs in a stream at 
the end of each time unit. 

An infinite timed stream represents a complete communication history; a 
finite timed stream represents a partial communication history. Since time never 
halts, we require that any infinite timed stream has infinitely many ticks. 

By $M^{\underline{\infty}}$, $M^{\underline{*}}$ and $M^{\underline{\omega}}$ we denote respectively the set of all infinite timed 
streams, the set of all finite timed streams, and the set of all finite and infinite 
timed streams over the set of messages $M$. 

We also work with streams without ticks, referred to as untimed streams. By 
$M^{\infty}$, $M^{*}$ and $M^{\omega}$ we denote respectively the set of all infinite untimed streams, 
the set of all finite untimed streams, and the set of all finite and infinite untimed 
streams over the set of messages $M$. In the sequel, $M$ denotes the set of all 
messages. Since $\surd$ is not a message, we have that $\surd \notin M$. 

By $\mathbb{N}$ we denote the set of natural numbers; by $\mathbb{N}_\infty$ and $\mathbb{N}_+$ we denote $\mathbb{N} \cup \{\infty\}$ 
and $\mathbb{N} \setminus \{0\}$, respectively. Given $A \subseteq M \cup \{\surd\}$, streams $r, s \in M^{\omega} \cup M^{\underline{\omega}}$, $t \in M^{\underline{\omega}}$ 
and $j \in \mathbb{N}_\infty$: 

- $\langle\rangle$ denotes the empty stream. 

- $\#r$ denotes the length of $r$: $\infty$ if $r$ is infinite, and the number of elements in 
$r$ otherwise. Note that time ticks are counted. For example, the length of a 
stream consisting of infinitely many ticks is $\infty$. 

- $r.j$ denotes the $j$th element of $r$ if $1 \le j \le \#r$. 

- $\langle a_1, a_2, .., a_n \rangle$ denotes the stream of length $n$, whose first element is $a_1$, whose 
second element is $a_2$, and so on. 

- $A \circledS r$ denotes the result of filtering away all messages (ticks included) not 
in $A$. For example: 

$\{a, b\} \circledS \langle a, b, \surd, c, \surd, a, \surd \rangle = \langle a, b, a \rangle$ 

- $r \frown s$ denotes the result of concatenating $r$ and $s$. For example: $\langle c, c \rangle \frown 
\langle a, b \rangle = \langle c, c, a, b \rangle$. If $r$ is infinite then $r \frown s = r$. 

- $r^j$ denotes the result of concatenating $j$ copies of the stream $r$. 

- $r \sqsubseteq s$ holds iff $r$ is a prefix of or equal to $s$. 

- $t{\downarrow}_j$ denotes the prefix of $t$ characterising the behaviour until time $j$; this 
means that $t{\downarrow}_j$ denotes $t$ if $j$ is greater than the number of ticks in $t$, and 
the shortest prefix of $t$ containing $j$ ticks, otherwise. Note that $t{\downarrow}_\infty = t$, and 
also that $t{\downarrow}_0 = \langle\rangle$. 

- $\overline{t}$ denotes the result of removing all ticks in $t$. Thus, $\overline{\langle a, \surd, b, \surd \rangle \frown \langle \surd \rangle^{\infty}} = 
\langle a, b \rangle$. 
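
For finite streams, the operators listed above have a direct executable reading. The Python sketch below represents a stream as a tuple and a tick as the string `"√"`; it covers finite streams only (infinite streams would need a lazy representation), and all function names are assumptions chosen for illustration.

```python
TICK = "√"


def filt(A, r):
    """Filtering: keep only the elements of r that are in A."""
    return tuple(m for m in r if m in A)


def concat(r, s):
    """Concatenation of two finite streams."""
    return tuple(r) + tuple(s)


def is_prefix(r, s):
    """True iff r is a prefix of or equal to s."""
    return tuple(s[:len(r)]) == tuple(r)


def trunc(t, j):
    """Truncation at time j: shortest prefix of t with j ticks, else t itself."""
    if j == 0:
        return ()
    ticks = 0
    for k, m in enumerate(t):
        if m == TICK:
            ticks += 1
            if ticks == j:
                return tuple(t[:k + 1])
    return tuple(t)  # j exceeds the number of ticks in t


def untimed(t):
    """Time abstraction: remove all ticks."""
    return tuple(m for m in t if m != TICK)


t = ("a", TICK, "b", TICK)
```

The examples from the list can be replayed with these functions, for instance `filt({"a", "b"}, ("a", "b", TICK, "c", TICK, "a", TICK))` and `untimed(t)`.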

A timed infinite stream is time-synchronous if exactly one message occurs in 
each time interval. Time-synchronous streams model the pulsed communication 
typical for synchronous digital hardware working in fundamental mode. Since, 
in this case, exactly one message occurs in each time interval, no information is 
gained from the ticks; the ticks can therefore be abstracted away; the result is an 
untimed infinite stream. To facilitate time abstraction in the time-synchronous 
case, the $\downarrow$ operator is overloaded to untimed infinite streams: For any stream 
$s \in M^{\infty}$ and $j \in \mathbb{N}_\infty$, $s{\downarrow}_j$ denotes the prefix of $s$ of length $j$. 

3 Specification of a FIFO Queue 

Elementary specifications are basically relations on streams. These relations are 
expressed by formulas in predicate calculus. This section introduces three 
formats for elementary specifications: The time-independent, the time-dependent 
and the time-synchronous format. We first specify the external behaviour of 
a FIFO queue in the time-independent format, a format tuned towards asyn- 
chronous communication. This specification is thereafter reformulated in the 
time-synchronous format; this format is particularly suited for the specifica- 
tion of digital hardware components communicating in a synchronous (pulsed) 
manner. Thereafter, we give a composite specification of the FIFO queue con- 
sisting of three elementary specifications: A hardware queue written in the time- 
synchronous format, a software queue written in the time-independent format, 
and an interface component written in the time-dependent format; the latter 
manages the communication between the first two. 

3.1 Time-Independent Specification 

The elementary specification below describes a FIFO queue. 

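== FIFO ======================== time_independent == 
in   u : G 
out  v : D 
---------------------------------------------------- 
asm  Req-Ok(u) 
- - - - - - - - - - - - - - - - - - - - - - - - - -  
com  FIFO-Beh(u, v) 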



FIFO is the name of the specification. Ignoring the dashed line, the specification 
FIFO is divided in two main parts by a single horizontal line. The upper part 
declares the input and output channels distinguished by the keywords in and 
out. Hence, FIFO has one input channel u of type G and one output channel v 
of type D. D is the set of all data elements. What exactly these data elements 
are is of no importance for this paper and therefore left unspecified. A request 
is represented by Req. It is assumed that $Req \notin D$. G is defined as follows: 

$G = D \cup \{Req\}$ 

Thus, data elements and requests are received on the input channel; the queue 
replies by sending data elements. We often refer to the identifiers naming the 
channels as channel identifiers. 

The lower frame, referred to as the body, describes the allowed input/output 
history. It consists of two parts: An assumption and a commitment identified by 
the keywords asm and com, respectively. The assumption describes the intended 
input histories; it is a pre-condition on the input histories. The commitment 
describes the output histories the queue is allowed to produce when the input 
history fulfills the assumption. Both the assumption and the commitment are 
described by formulas in predicate calculus. In these formulas u and v are free 
variables of type $G^{\omega}$ and $D^{\omega}$, respectively; they represent the communication 
histories of the channels they name. That u and v represent untimed streams 
(and, for instance, not timed infinite streams, as in the time-dependent case) is 
specified by the keyword time_independent in the upper right-most corner. 



Assumption The assumption is expressed by an auxiliary predicate 
Req-Ok(a) 

This predicate states that for any prefix b of the untimed stream a, the number of 
requests in b is less than or equal to the number of data elements in b. Formally: 

$\forall b : b \sqsubseteq a \Rightarrow \#(\{Req\} \circledS b) \le \#(D \circledS b)$ 




Thus, the assumption of FIFO formalises that, at any point in time, the number 
of received requests is less than or equal to the number of received data elements; 
in other words, the environment is assumed never to send a request when the 
queue is empty. Note that a is of type $M^{\omega} \cup M^{\underline{\omega}}$ and not just $G^{\omega}$ ($\subseteq M^{\omega}$). 
This supports the reuse of Req-Ok in other contexts: the definition of 
the auxiliary predicate Req-Ok is independent of the specification FIFO in 
the sense that Req-Ok can be exploited also in other specifications. 
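
On a finite stream the assumption can be checked by a single pass that compares the two counters after every element. The Python sketch below is an illustration only; the function name `req_ok`, the representation of streams as tuples and the concrete data elements are assumptions. Elements that are neither requests nor data (ticks, default messages) are ignored, which mirrors the filtering in the definition.

```python
REQ = "Req"


def req_ok(a, data):
    """True iff every prefix of a has at most as many requests as data elements."""
    requests = 0
    items = 0
    for m in a:
        if m == REQ:
            requests += 1
        elif m in data:
            items += 1
        if requests > items:
            return False  # a request arrived while the queue was empty
    return True


D = {"d1", "d2"}
good = req_ok(("d1", REQ, "d2", REQ), D)
bad = req_ok((REQ, "d1"), D)
timed = req_ok(("√", "d1", "√", REQ, "√"), D)
```

The third call shows the reuse mentioned above: the same check applies unchanged to a timed stream.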



Commitment Also the commitment employs an auxiliary predicate 
FIFO-Beh(a, b) 

This predicate requires that the stream of data elements sent along b is a prefix 
of the stream of data elements sent along a (first conjunct), and that the number 
of data elements in b is equal to the number of requests in a (second conjunct). 
Formally: 

FIFO-Beh 

$a, b \in M^{\omega} \cup M^{\underline{\omega}}$ 

$D \circledS b \sqsubseteq (D \circledS a) \;\wedge\; \#(D \circledS b) = \#(\{Req\} \circledS a)$ 



Thus, the commitment of FIFO requires that the data elements are output in the 
FIFO order, and that each request is replied to by the transmission of exactly 
one data-element. 
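
The two conjuncts of the commitment translate directly into a check on finite streams. The following Python sketch is an illustration under the assumption that streams are tuples; `fifo_beh` is a hypothetical name and the data elements are made up for the example.

```python
REQ = "Req"


def fifo_beh(a, b, data):
    """Check the two conjuncts of the commitment on finite streams a (input) and b (output)."""
    data_in = [m for m in a if m in data]
    data_out = [m for m in b if m in data]
    requests = sum(1 for m in a if m == REQ)
    in_fifo_order = data_in[:len(data_out)] == data_out  # first conjunct
    one_reply_per_request = len(data_out) == requests     # second conjunct
    return in_fifo_order and one_reply_per_request


D = {"x", "y"}
a = ("x", "y", REQ)
```

With one request in `a`, the only output accepted is the single element `"x"`: the output `("y",)` violates the FIFO order and `("x", "y")` contains more replies than requests.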

The semantics of elementary specifications is defined formally in Sect. 4.2. 

3.2 Time-Synchronous Specification 

The time-independent format employed in the previous section is well-suited to 
specify software components communicating asynchronously. To describe hard- 
ware applications communicating in a time-synchronous (pulsed) manner, we 
use the time-synchronous format as in this section. We now restate FIFO in a 
time-synchronous setting. Time-synchronous communication is modelled as fol- 
lows: In each time unit exactly one message is transmitted along each channel. 
For the case that the time-synchronous FIFO queue or its environment has no 
“real” message to transmit, a default message Dlt is used. It is assumed that 
$Dlt \notin G$. We define: 

$G_{Dlt} = G \cup \{Dlt\}, \quad D_{Dlt} = D \cup \{Dlt\}$ 

FIFO can then be restated in the time-synchronous format as follows. 

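== FIFOts ====================== time_synchronous == 
in   i : G_Dlt 
out  o : D_Dlt 
---------------------------------------------------- 
asm  Req-Ok(i) 
- - - - - - - - - - - - - - - - - - - - - - - - - -  
com  FIFO-Beh(i, o) 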



The keyword in the upper right corner implies that the channel identifiers in the 
body represent infinite untimed streams instead of untimed streams of arbitrary 
length as in FIFO. The filtration with respect to D in the definitions of the 
two auxiliary predicates of the previous section makes their reuse in FIFOts 
possible. 



3.3 Composite Specification 

We have presented elementary specifications written in the time-independent 
and the time-synchronous format. As already mentioned, there is also a time- 
dependent format. The time-dependent format is tuned towards asynchronous 
communication. It differs from the time-independent format in that it allows 
timing requirements to be expressed. We now use all three formats for elemen- 
tary specifications to describe a composite FIFO queue that is partly to be 
implemented in hardware and partly to be implemented in software. 

As illustrated by Fig. 1, the queue consists of three components, namely a 
hardware queue specified by HWQ, a software queue specified by SWQ, and an 
interface component specified by INTF. The interface component is needed as 
a converter between the time-synchronous hardware component and the asyn- 
chronous software component. 

Fig. 1. Decomposed FIFO Queue 



The hardware queue has only a bounded amount of internal memory: It can 
never store more than $W_H$ data elements at the same point in time. To avoid 
memory overflow it may forward data elements and requests to the software 
queue. The software queue is in principle unbounded. The external behaviour of 
INTF and SWQ composed should be that of FIFOts. 

Hardware Queue The hardware queue communicates in a time-synchronous 
manner: In each time-unit the hardware queue receives exactly one message on 
each input channel (can be understood as an implicit environment assumption) 
and outputs exactly one message along each output channel. The hardware queue 
is specified as follows. 

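== HWQ ========================= time_synchronous == 
in   i : G_Dlt; y : D_Dlt 
out  o : D_Dlt; x : G_Dlt 
---------------------------------------------------- 
asm  Req-Ok(i) ∧ FIFO-Beh(x, y) 
- - - - - - - - - - - - - - - - - - - - - - - - - -  
com  FIFO-Beh(i, o) ∧ Req-Ok(x) ∧ Bnd-Hwm(i, y, o, x, W_H) 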



The first conjunct of the assumption is the same as in FIFOts; the same holds 
for the first conjunct of the commitment. The second conjunct of the assumption 
implies that the hardware queue guarantees correct behaviour only as long as the 
software queue (composed with the interface component) behaves like a FIFO 
queue. The second conjunct of the commitment requires the hardware queue 
to use the software queue correctly, namely by sending requests only when the 
software queue has at least one data element to send in return. The third conjunct 
imposes the boundedness requirement on the number of messages that can be 
stored by the hardware queue: At any point in time, the number of data elements 
in the hardware queue is less than or equal to $W_H$, where $W_H$ is some constant 
of type natural number. The auxiliary predicate Bnd-Hwm is formally defined 
as follows. 
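
Bnd-Hwm 

$a, b, c, d \in G_{Dlt}^{\infty}$; $t \in \mathbb{N}$ 

$\forall j \in \mathbb{N} : \#(D \circledS (a{\downarrow}_j)) + \#(D \circledS (b{\downarrow}_j)) - [\#(D \circledS (c{\downarrow}_j)) + \#(D \circledS (d{\downarrow}_j))] \le t$ 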
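
Read operationally, the boundedness predicate tracks how many data elements the hardware queue currently holds: everything received on the two input channels minus everything sent on the two output channels. The Python sketch below checks this on finite prefixes of equal length of four time-synchronous streams; the function name `bnd_hwm` and the example streams are assumptions for illustration.

```python
DLT = "Dlt"


def bnd_hwm(a, b, c, d, t, data):
    """Check that the number of stored data elements never exceeds t.

    a, b are the input streams and c, d the output streams; in the
    time-synchronous case position j of each stream belongs to time unit j.
    """
    stored = 0
    for ma, mb, mc, md in zip(a, b, c, d):
        stored += (ma in data) + (mb in data) - (mc in data) - (md in data)
        if stored > t:
            return False
    return True


D = {"d"}
i = ("d", "d", DLT)      # two data elements arrive
y = (DLT, DLT, DLT)      # nothing comes back from the software queue
o = (DLT, DLT, "d")      # one element is delivered in the third time unit
x = (DLT, DLT, DLT)      # nothing is forwarded to the software queue
```

With these streams the queue holds two elements after the second time unit, so a bound of 2 is respected and a bound of 1 is violated.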



Interface Component The interface component communicates time-synchron- 
ously on the channels x and y. However, since the communication on the channels 
s and r is asynchronous, it cannot be specified in the time-synchronous format. 
As will be clear when we define specifications semantically in Sect. 4.2, time- 
synchronous communication cannot be captured in the time-independent format; 
this means that the time-dependent format is required. 

The interface component is specified as follows. 



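== INTF ========================== time_dependent == 
in   x : G_Dlt; s : D 
out  y : D_Dlt; r : G 
---------------------------------------------------- 
asm  TimeSynch(x) ∧ Req-Ok(x) 
- - - - - - - - - - - - - - - - - - - - - - - - - -  
com  Eq(x, r, G) ∧ TimeSynch(y) ∧ Eq(s, y, D) 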





The keyword in the upper right corner specifies that INTF is time-dependent. 
In that case, the channel identifiers in the body represent timed infinite streams. 

The first conjunct of the assumption restricts the input on the channel x to 
be time-synchronous; this is expressed by an auxiliary predicate defined formally 
as follows. 

TimeSynch 

$a \in M^{\underline{\infty}}$ 

$\forall j \in \mathbb{N} : \#(M \circledS (a{\downarrow}_j)) = j$ 



Due to this assumption, the input frequency on x is bounded — a fact which 
may simplify the implementation of the specification. 

The task of the interface component is to convert the time-synchronous 
stream x into an ordinary timed stream r and to perform the opposite con- 
version with respect to s and y. This conversion can, of course, introduce ad- 
ditional delays. The second conjunct of the commitment makes sure that y is 
time-synchronous; the first and third conjunct employ an auxiliary predicate Eq 
to describe what it means to forward correctly. It is defined formally as follows. 

Eq 

$a, b \in M^{\underline{\infty}}$; $A \in \mathbb{P}(M)$ 

$A \circledS a = A \circledS b$ 

Thus, Eq requires that the streams a and b are identical when projected on the 
set of messages A. For any set S, $\mathbb{P}(S)$ yields the set $\{T \mid T \subseteq S\}$. 
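
Both auxiliary predicates of the interface component can be tried out on finite prefixes of timed streams. The Python sketch below is an approximation: the definitions quantify over infinite streams, whereas the code only inspects a finite tuple and tolerates an unfinished last time unit. The names `time_synch_prefix` and `eq` and the example streams are assumptions.

```python
TICK = "√"


def time_synch_prefix(a):
    """True iff every completed time unit of a contains exactly one message."""
    msgs_in_unit = 0
    for m in a:
        if m == TICK:
            if msgs_in_unit != 1:
                return False
            msgs_in_unit = 0
        else:
            msgs_in_unit += 1
    return msgs_in_unit <= 1  # the unfinished last unit may still be empty


def eq(a, b, A):
    """True iff a and b agree when projected on the set of messages A."""
    return tuple(m for m in a if m in A) == tuple(m for m in b if m in A)


x = ("a", TICK, "Dlt", TICK, "b", TICK)   # time-synchronous, with a default message
r = (TICK, "a", "b", TICK)                # same messages, different timing
```

The streams `x` and `r` differ in timing and in the default message, yet `eq(x, r, {"a", "b"})` holds; this is the sense in which the interface component forwards correctly while being free to introduce delays.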

Software Queue The specification of the software queue is expressed in terms 
of FIFO and substitution of channel identifiers: 

$\mathrm{SWQ} = \mathrm{FIFO}[u \mapsto r, v \mapsto s]$ 

The interpretation is as follows: The specification SWQ is equal to the 
specification obtained from FIFO by replacing its name by SWQ, the channel identifier 
u by r and the channel identifier v by s. 



Composite Specification The three specifications HWQ, INTF and SWQ can 
be composed into a composite specification describing the network illustrated by 
Fig. 1 as follows. 

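== FIFO_NET ======================================== 
in   i : G_Dlt 
out  o : D_Dlt 
loc  x : G_Dlt; r : G; y : D_Dlt; s : D 
---------------------------------------------------- 
(o, x) := HWQ(i, y)    (y, r) := INTF(x, s)    (s) := SWQ(r) 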



The keyword loc distinguishes the declarations of the four local channels from the 
declarations of the external input and output channels. The body consists of the 
three component specifications introduced above represented as nondeterministic 
assignments with the output channels to the left and the input channels to the 
right. We require that the sub-specifications of a composite specification have 
disjoint sets of output identifiers. The semantics of composite specifications is 
defined formally in Sect. 4.3. 

Both elementary and composite specifications are required to have disjoint 
sets of external input and output identifiers. In the composite case, the local 
channel identifiers must be different from the external channel identifiers. A 
composite specification is time-dependent, time-independent or time-synchron- 
ous depending on whether its elementary specifications are all time-dependent, 
time-independent or time-synchronous, respectively. Any other composite spec- 
ification is mixed. 
4 Refinement



The composite specification FIFO_NET is a conditional refinement of the
elementary specification FIFOts, and FIFOts is a conditional refinement of the
elementary specification FIFO. In this section we argue the correctness of this
claim by mathematical means. For this purpose, we first describe a schematic
translation of any specification into the time-dependent format; then we define
the semantics of elementary and composite time-dependent specifications in a
more mathematical manner and introduce two concepts of conditional refinement.



4.1 Schematic Translation into Time-Dependent Format 



The time-independent and time-synchronous formats can be understood as syntactic
sugar for the time-dependent format. In fact, any specification written
in the time-independent or the time-synchronous format can be schematically
translated into a time-dependent specification that is, by definition, semantically
equivalent. For any time-independent elementary specification S, let TO(S)
denote the time-dependent specification obtained from S by replacing the keyword
time_independent by time_dependent and any occurrence of any channel identifier
c in the body of S by $\overline{c}$. TO(S) captures the meaning of the time-independent
specification S in a time-dependent setting.

Any elementary time-synchronous specification S can be translated into a
time-dependent specification TO(S) that is, by definition, semantically equivalent,
by performing exactly the same modifications as in the time-independent case and, in
addition, extending the assumption with the conjunct Time-Synch(i) for each
input channel i, and the commitment with the conjunct Time-Synch(o) for each
output channel o.

For any time-dependent elementary specification S, we define TO(S) = S. If S
is composite then TO(S) is equal to the result of applying TO to its component
specifications.

For any elementary time-dependent specification S, by $I_S$, $O_S$, $A_S$ and $C_S$
we denote the set of typed input streams, the set of typed output streams, the
assumption and the commitment of TO(S), respectively; we say that $(I_S, O_S)$ is
the external interface of S. If S is a composite specification we use $I_S$, $O_S$, $L_S$ to
denote the sets of typed input, output and local streams of TO(S), respectively.
For example, with respect to FIFO_NET of Sect. 3.3, we have

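$I_{\mathrm{FIFO\_NET}} = \{ i \in G_{DU}^{\underline{\infty}} \}$

$O_{\mathrm{FIFO\_NET}} = \{ o \in D_{DU}^{\underline{\infty}} \}$

$L_{\mathrm{FIFO\_NET}} = \{ x \in G_{DU}^{\underline{\infty}},\ r \in G^{\underline{\infty}},\ y \in D_{DU}^{\underline{\infty}},\ s \in D^{\underline{\infty}} \}$

Let V be a set of typed streams $\{ v_1 \in T_1^{\underline{\infty}}, \ldots, v_n \in T_n^{\underline{\infty}} \}$ and P a formula.
We define

$\forall V : P \equiv \forall v_1 \in T_1^{\underline{\infty}}; \ldots; v_n \in T_n^{\underline{\infty}} : P$

$\exists V : P \equiv \exists v_1 \in T_1^{\underline{\infty}}; \ldots; v_n \in T_n^{\underline{\infty}} : P$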



4.2 Semantics of Elementary Time-Dependent Specifications 

As already explained, there is a schematic translation of any time-independent 
or time-synchronous specification into the time-dependent format. 

To capture the semantics of elementary time-dependent specifications, we
introduce some helpful notational conventions. Let P be a formula whose free
variables are among and typed in accordance with

$V = \{ v_1 \in T_1^{\underline{\infty}}, \ldots, v_n \in T_n^{\underline{\infty}} \}$

Hence, P is basically a predicate on timed infinite streams. By $P{\downarrow}_j$ we
characterise the “prefix of P at time j”. $P{\downarrow}_j$ is a formula whose free variables are
among and typed in accordance with V. $P{\downarrow}_j$ holds if we can find extensions
$v_1', \ldots, v_n'$ of $v_1{\downarrow}_j, \ldots, v_n{\downarrow}_j$ such that P' holds. Formally, for any $j \in \mathbb{N}_\infty$, we
define $P{\downarrow}_j$ to denote the formula

$\exists v_1' \in T_1^{\underline{\infty}}; \ldots; v_n' \in T_n^{\underline{\infty}} : v_1{\downarrow}_j \sqsubseteq v_1' \wedge \ldots \wedge v_n{\downarrow}_j \sqsubseteq v_n' \wedge P'$

where P' denotes the result of replacing each occurrence of $v_i$ in P by $v_i'$. Note
that $P{\downarrow}_\infty = P$. $P{\downarrow}_\infty$ has been introduced for the sake of convenience: it
allows certain formulas, like the one below characterising the denotation of a
specification, to be formulated more concisely. On some occasions we also need
the prefix of P at time j with respect to a subset of free variables $A \subseteq V$; we
define $P{\downarrow}_{A:j}$ to be equal to $P{\downarrow}_j$ if the free variables $V \setminus A$ are interpreted as
constants.

The denotation $[\![ S ]\!]$ of an elementary time-dependent specification S is
defined by the formula:

$\forall j \in \mathbb{N}_\infty : A_S{\downarrow}_j \Rightarrow C_S{\downarrow}_{I_S:j}{\downarrow}_{O_S:j+1}$

Informally, S requires that:

partial input (j < ∞): The output is in accordance with the commitment $C_S$
until time j + 1 if the input is in accordance with the assumption $A_S$ until
time j;

complete input (j = ∞): The output is always in accordance with the
commitment $C_S$ if the input is always in accordance with the assumption $A_S$.
Note that

$(A_S{\downarrow}_\infty \Rightarrow C_S{\downarrow}_{I_S:\infty}{\downarrow}_{O_S:\infty+1}) \Leftrightarrow (A_S \Rightarrow C_S)$







This one-unit-longer semantics, inspired by [1], requires a valid implementation 
to satisfy the commitment at least one time unit longer than the environment 
satisfies the assumption. This is basically a causality requirement: It disallows 
“implementations” that falsify the commitment because they “know” that the 
environment will falsify the assumption at some future point in time. Since no 
real implementation can predict the behaviour of the environment in this sense, 
we do not eliminate real implementations. Note that the assumption may refer to 
the output identifiers; this is often necessary to express the required assumptions 
about the input history (see HWQ in Sect. 3.3). 



4.3 Semantics of Composite Time-Dependent Specifications 

The denotation of a composite specification is defined in terms of the denotations
of its component specifications. Let S be a composite specification whose body
consists of m specifications

$S_1, \ldots, S_m$

Its denotation $[\![ S ]\!]$ is characterised by the formula

$\exists L_S : [\![ S_1 ]\!] \wedge \ldots \wedge [\![ S_m ]\!]$

As explained in Sect. 4.1, $L_S$ is the set of typed local streams in TO(S). To
simplify the formal manipulation of composite specifications, we define
$S_1 \otimes \ldots \otimes S_m$ to denote the composite specification S whose sets of typed input, output and
local streams are defined by

$I_S = (\bigcup_{j=1}^{m} I_{S_j}) \setminus (\bigcup_{j=1}^{m} O_{S_j})$

$O_S = (\bigcup_{j=1}^{m} O_{S_j}) \setminus (\bigcup_{j=1}^{m} I_{S_j})$

$L_S = (\bigcup_{j=1}^{m} I_{S_j}) \cap (\bigcup_{j=1}^{m} O_{S_j})$

and whose body consists of the m specifications $S_1, \ldots, S_m$ (represented in the
form of nondeterministic assignments). For instance, FIFO_NET of Sect. 3.3 is
equal to $\mathrm{HWQ} \otimes \mathrm{INTF} \otimes \mathrm{SWQ}$.
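
The interface of a composition can be computed mechanically from the component interfaces. The sketch below is illustrative only: it ignores the channel types, represents each component as a hypothetical pair of Python sets of channel names, and the function name `compose` is an assumption. The channel names are those of HWQ, INTF and SWQ in the FIFO_NET network.

```python
from typing import Set, Tuple

Interface = Tuple[Set[str], Set[str]]  # (input channels, output channels)


def compose(*components: Interface) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (inputs, outputs, locals) of the composition of the components."""
    all_in: Set[str] = set().union(*(i for i, _ in components))
    all_out: Set[str] = set().union(*(o for _, o in components))
    external_in = all_in - all_out    # consumed, but produced by no component
    external_out = all_out - all_in   # produced, but consumed by no component
    local = all_in & all_out          # produced and consumed inside the network
    return external_in, external_out, local


HWQ = ({"i", "y"}, {"o", "x"})
INTF = ({"x", "s"}, {"y", "r"})
SWQ = ({"r"}, {"s"})

inputs, outputs, local = compose(HWQ, INTF, SWQ)
assert inputs == {"i"}
assert outputs == {"o"}
assert local == {"x", "y", "r", "s"}
```

The result agrees with the declarations of FIFO_NET: one external input i, one external output o, and the four local channels x, r, y and s.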

It can be shown (see [6]) that if the component specifications are all realizable 
by functions that are contractive with respect to the Baire metric then the 
composite specification is also realizable with respect to such a function. Hence, 
when specifications are all realizable in this sense then there is at least one 
fixpoint when they are composed. 



4.4 Two Concepts of Conditional Refinement 

Consider two specifications $S_1$ and $S_2$, and a formula B whose free variables are
all among (and typed in accordance with) $I_{S_2} \cup O_{S_2}$.

The specification $S_2$ is a behavioural refinement of the specification $S_1$ with
respect to the condition B, written^

$S_1 \leadsto_B S_2$

if

$I_{S_1} = I_{S_2}$, $O_{S_1} = O_{S_2}$, $\forall I_{S_1}; O_{S_1} : B \wedge [\![ S_2 ]\!] \Rightarrow [\![ S_1 ]\!]$

Behavioural refinement allows us to make additional assumptions about the
environment since we consider only those input histories that satisfy the condition;
moreover, it allows us to reduce under-specification since we only require that
any input/output-history of $S_2$ is also an input/output-history of $S_1$, and not
the other way around.

That the external interfaces of the two specifications are required to be the
same is often inconvenient. We therefore also introduce a more general concept
that characterises conditional refinement with respect to an interface translation
(see Fig. 2). Assume that U and D are specifications such that

$(I_{S_2} \setminus I_{S_1}) = I_U$, $(I_{S_1} \setminus I_{S_2}) = O_U$, $(O_{S_1} \setminus O_{S_2}) = I_D$, $(O_{S_2} \setminus O_{S_1}) = O_D$

The specification $S_2$ is an interface refinement of the specification $S_1$ with respect
to the condition B, the upwards relation U and the downwards relation D,
written

$S_1 \leadsto^{(U,D)}_B S_2$

if

$U \otimes S_1 \otimes D \leadsto_B S_2$

U translates input streams of $S_2$ to input streams of $S_1$; D translates output
streams of $S_1$ to output streams of $S_2$ (in both cases, ignoring streams with
identical names). Both the upwards and the downwards relations are defined as
specifications, but these specifications are not to be implemented. Their task is
to record design decisions with respect to the external interface.

Let DS (for dummy specification) represent the specification whose external
interface is empty ({}, {}). We may then define behavioural refinement in terms
of interface refinement as follows

$S_1 \leadsto_B S_2 \equiv S_1 \leadsto^{(DS,DS)}_B S_2$



We define the following short-hands 



5i 52 = Si 



C (U,D) 
01 



_ c (U,D) 



52 = 5i 



^ In the same way as we distinguish between three formats for elementary specifications,
we could also distinguish between three formats for conditions. $[\![\ ]\!]$ could then
be overloaded to conditions in the obvious manner. However, to keep things simple,
we view any condition as a formula whose free variables represent timed infinite
streams.
Fig. 2. Interface Refinement



Behavioural refinement is “transitive” in the following sense:

$S_1 \leadsto_{B_1} S_2 \wedge S_2 \leadsto_{B_2} S_3 \Rightarrow S_1 \leadsto_{B_1 \wedge B_2} S_3$

Interface refinement satisfies a similar property (see App. A).

4.5 Correctness of the FIFO Decomposition 

We first argue that FIFOts is an interface refinement of FIFO; thereafter, that
FIFO_NET is a behavioural refinement of FIFOts. This allows us to deduce
that FIFO_NET is an interface refinement of FIFO. We base our argumentation
on deduction rules formulated in App. A and proved sound in App. B.

Since the external interfaces of FIFO and FIFOts are different, behavioural
refinement is not sufficient; it is not just a question of renaming channels, we
also have to translate time-synchronous streams with DU elements into streams
without. The upwards and downwards relations are formally specified as follows.



U_FIFO (time_dependent)
in $i : G_{DU}$
out $u : G$
com $u = (G \cup \{\surd\}) \,\circledS\, i$



D_FIFO (time_dependent)
in $v : D$
out $o : D_{DU}$
com $v = (D \cup \{\surd\}) \,\circledS\, o$
Since in this particular case there are no assumptions to be imposed, the
assumptions have been left out. It follows straightforwardly by the definition of
behavioural refinement that

$\mathrm{FIFO} \leadsto^{(U_{FIFO}, D_{FIFO})}_{B_{FIFO}} \mathrm{FIFOts}$ (1)

where

$B_{FIFO} = \text{Time-Synch}(i)$

The next step is to show that

$\mathrm{FIFOts} \leadsto \mathrm{FIFO\_NET}$ (2)

(2) is equivalent to

$\mathrm{FIFOts} \leadsto \mathrm{HWQ} \otimes \mathrm{INTF} \otimes \mathrm{SWQ}$ (3)

Let

$\mathrm{FIFOts}' = \mathrm{FIFOts}[i \mapsto x,\ o \mapsto y]$

Since $\leadsto$ is reflexive, (2) follows by the transitivity and modularity rules (see
App. A.1, A.2) if we can show that

$\mathrm{FIFOts} \leadsto \mathrm{HWQ} \otimes \mathrm{FIFOts}'$ (4)

$\mathrm{FIFOts}' \leadsto \mathrm{INTF} \otimes \mathrm{SWQ}$ (5)

Both (4) and (5) follow by the decomposition rule (see App. A.3). (1), (2) and
the transitivity rule give

$\mathrm{FIFO} \leadsto^{(U_{FIFO}, D_{FIFO})}_{B_{FIFO}} \mathrm{FIFO\_NET}$ (6)

Thus, FIFO_NET is a conditional interface refinement of FIFO.

5 Imposing Additional Boundedness Constraints 

The composite specification FIFO_NET does not impose constraints on the
response time of the components: replies can be issued after an unbounded delay.
Moreover, the software queue was assumed to be unbounded. Since any
computerised component has a bounded memory, this is not realistic. In fact,
in FIFO_NET the interface component is also required to have an unbounded
memory, since there is no upper bound on the message frequency for the channel
s, and the communication on the channel y is time-synchronous. In this section,
we show that the composite specification FIFO_NET can be refined into another
composite specification FIFO_NETb in which response time constraints are
imposed, and where additional environment assumptions make both the software
queue and the interface component directly implementable. The response time
constraints are informally described as follows:



— The hardware queue has a required delay of exactly $T_h$ time units. Hence,
it is required to reply to a request exactly $T_h$ time units after the request is
issued.

— The interface component has a required delay of not more than $T_i$ time units
when forwarding messages from the hardware queue to the software queue.
Note that the delay can be less than $T_i$ time units. Thus, the delay is not
fixed as in the case of the hardware queue.

— The interface and the software queue together have a maximal delay of
$T_s + 2 * T_i$ time units, where * is the operator for multiplication. Any reply to
a request made by the hardware queue is to be provided within this range
of time. The software queue needs maximally $T_s$ time units; the remaining
$2 * T_i$ time units can be consumed by the interface.

$T_h$, $T_i$ and $T_s$ are all constants of type natural number. That the interface
component and the software queue together have a maximal delay of $T_s + 2 * T_i$
time units does not mean that the interface component has a maximal delay
of $T_i$ time units when messages are forwarded from the software queue to the
hardware queue. In fact, such a requirement would not be implementable. To
see that, first note that the software queue may send an unbounded number of
data elements within the same time unit. Assume, for example, that $T_i = 1$, and
that the software queue sends three data elements in the nth time unit. Since
the communication on y is time-synchronous, the interface component needs at
least three time units to forward these messages along y, thereby breaking the
requirement that no data element should be delayed by more than one time unit.
In fact, for the forwarding from the software queue to the hardware queue, the
requirement below is sufficient:

— If at some point in time there are exactly e data elements that have been
sent by the software queue, but not yet forwarded to the hardware queue,
then the interface component will forward these data elements within the
next $T_i + (e - 1)$ time units.
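
The three-element burst discussed above can be replayed with a small simulation. This is a sketch under stated assumptions, not the interface component of the paper: it models a hypothetical greedy forwarder that emits exactly one message per time unit on the time-synchronous channel (a `"DU"` filler when nothing is pending) and can forward a message at the earliest in the time unit after it arrived, which corresponds to $T_i = 1$. The function name and the list-of-lists encoding of a timed stream are also assumptions.

```python
from collections import deque
from typing import Deque, List


def forward_time_synchronous(s: List[List[str]]) -> List[str]:
    """Forward the timed stream `s` (one list of messages per time unit)
    onto a channel that carries exactly one message per time unit."""
    pending: Deque[str] = deque()
    y: List[str] = []
    unit = 0
    while unit < len(s) or pending:
        # Emit one message for this time unit: oldest pending data, else filler.
        y.append(pending.popleft() if pending else "DU")
        # Messages arriving in this unit become available from the next unit on.
        if unit < len(s):
            pending.extend(s[unit])
        unit += 1
    return y


# Three data elements sent within the same (first) time unit.
burst = [["d1", "d2", "d3"]]
print(forward_time_synchronous(burst))
```

With e = 3 pending elements, the last one leaves in the third time unit after the burst, that is, within $T_i + (e - 1) = 3$ time units, while a fixed bound of one time unit per element could not be met.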

If the interface component satisfies this requirement then the hardware queue is
guaranteed to receive a reply to each request within $T_s + 2 * T_i$ time units. To
see that, first note that the communication on x is time-synchronous. Together
with the timing constraints imposed on the communication along r and s this
implies that if e messages are received within the same time unit on the channel
s then not more than one of these, namely the first, can be received with the
maximal delay of $T_i + T_s$ time units with respect to the corresponding request
on x; the second cannot be delayed by more than $T_i + T_s - 1$ time units, and
so on.

The constraint on the size of the internal memory is described informally as
follows:

— The FIFO queue is not required to store more than $W_f$ data elements, where
$W_f$ is a constant of type natural number. Since the hardware queue can store
$W_h$ data elements, this means that the software part does not have to store
more than $W_f - W_h$ data elements. Obviously, it holds that
$W_h < W_f$

Since the communication along x is time-synchronous, and the interface
component has a maximal delay of $T_i$ time units, the software queue will never receive
more than $T_i$ requests within one time unit on the channel r. This means that
$T_i * T_s$ is an upper bound on the number of data elements that can be sent by
the software queue along the channel s within one time unit.

5.1 Hardware Queue 

The hardware queue is once more specified in the time-synchronous format. 
The only modification to the external interface is that the output channel o is 
renamed to w. This renaming is necessary since we want to translate o into w 
with the help of a downwards relation, and we require specifications (and thereby 
downwards relations) to have disjoint sets of input and output identifiers. 

HWQb (time_synchronous)
in $i : G_{DU}$; $y : D_{DU}$
out $w : D_{DU}$; $x : G_{DU}$

asm $\text{Req-Ok}(i) \wedge \text{FIFO-Beh}(x, y)$
$\text{Bnd-Rsp}(x, y, T_s + 2 * T_i, \{\mathrm{Req}\}, D) \wedge \text{Bnd-Qum}(i, T_h, W_f)$

com $\text{Exact-Rsp}(i, w, T_h) \wedge \text{Req-Ok}(x) \wedge \text{Bnd-Hwm}(i, y, w, x, W_h)$
$\text{Bnd-Qum}(x, T_s + 2 * T_i, W_f - W_h)$



Throughout this paper, line-breaks in assumptions and commitments represent
logical conjunction. The two first conjuncts of the assumption have been
inherited from HWQ; the same holds for the second and third conjunct of the
commitment. Clearly, the hardware queue can only be required to fulfil its
response time requirement as long as the two other components fulfil their response
time requirements; the third conjunct of the assumption therefore requires the
two environment components to reply to a request made by the hardware queue
within $T_s + 2 * T_i$ time units. The auxiliary predicate Bnd-Rsp is formally defined
as follows.

Bnd-Rsp

$a, b \in M^{\underline{\infty}}$; $t \in \mathbb{N}$; $A, B \in \mathbb{P}(M)$

$\forall j \in \mathbb{N} : \#[A \,\circledS\, (a{\downarrow}_j)] \le \#[B \,\circledS\, (b{\downarrow}_{j+t})]$
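
A bounded-response requirement of this kind can be checked on finite prefixes. The sketch below assumes the reading that every message from A seen on a up to time j must be matched by a message from B on b up to time j + t; the function names, the list-of-lists encoding of timed streams (one list of messages per time unit) and the sample streams are assumptions made for illustration.

```python
from typing import List, Set

Timed = List[List[str]]  # element k holds the messages of time unit k + 1


def count_until(stream: Timed, j: int, messages: Set[str]) -> int:
    """Number of messages from `messages` in `stream` up to time j."""
    return sum(1 for unit in stream[:j] for m in unit if m in messages)


def bnd_rsp(a: Timed, b: Timed, t: int, A: Set[str], B: Set[str]) -> bool:
    """Finite-prefix check: messages from A on `a` up to time j are matched
    by at least as many messages from B on `b` up to time j + t."""
    # Beyond the end of the finite prefix of `b`, nothing more is counted.
    return all(
        count_until(a, j, A) <= count_until(b, j + t, B)
        for j in range(len(a) + 1)
    )


requests = [["Req"], [], ["Req"], []]
replies = [[], ["d1"], [], ["d2"]]
D = {"d1", "d2"}

assert bnd_rsp(requests, replies, 1, {"Req"}, D)       # replies one unit later
assert not bnd_rsp(requests, replies, 0, {"Req"}, D)   # too late for t = 0
```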
The fourth conjunct of the assumption makes sure that the FIFO queue is never
required to store more than $W_f$ data elements; the second parameter is needed
since there is a delay of $T_h$ time units between the transmission of a request and
the output of the corresponding data element. The auxiliary predicate Bnd-Qum
is formally defined as follows.

Bnd-Qum 

a G M°° U t, n G N 



Vi G N : - #[{^!e9}©(4i)] < n 



The fourth conjunct of the commitment formalises a similar requirement for the
output along the channel x; note that this requirement is slightly stronger than
it has to be since, for simplicity, we do not allow the hardware queue to exploit
that the software part may reply in less than $T_s + 2 * T_i$ time units.

The first conjunct of the commitment represents both a strengthening and
a weakening of the corresponding conjunct in HWQ. It employs an auxiliary
predicate Exact-Rsp that is formally defined as follows.

Exact-Rsp

$a \in G_{DU}^{\infty}$; $b \in D_{DU}^{\infty}$; $t \in \mathbb{N}$

$\forall j \in \mathbb{N}_+ : a.j = \mathrm{Req} \Rightarrow b.(j + t) = (D \,\circledS\, a).(\#[\{\mathrm{Req}\} \,\circledS\, (a|_j)])$



Exact-Rsp is a strengthening of FIFO-Beh in the sense that the reply is output
with a delay of exactly t time units: if the jth message of the input stream a is a
request then the (j + t)th message of the output stream b is the nth data element
received on a, where n is the number of requests among the j first elements of
a. Exact-Rsp is a weakening of FIFO-Beh in the sense that for any j such that
$a.j \neq \mathrm{Req}$, nothing is said about b at time j + t except that the message output
is an element of $D_{DU}$.
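
The informal reading of Exact-Rsp given above can be turned into a check on finite prefixes. This is only a sketch: time-synchronous streams are modelled as Python lists with one message per time unit, the message names are hypothetical, positions beyond the end of the finite prefix of b are simply not checked, and a request for which no data element is available in the prefix is treated as a violation.

```python
from typing import List, Set


def exact_rsp(a: List[str], b: List[str], t: int, data: Set[str]) -> bool:
    """If the j-th message of `a` is the n-th request, then the (j + t)-th
    message of `b` must be the n-th data element received on `a`."""
    received = [m for m in a if m in data]   # the data elements of a, in order
    requests = 0
    for j, msg in enumerate(a, start=1):     # positions are 1-based, as in the text
        if msg != "Req":
            continue                         # nothing is required at time j + t
        requests += 1
        if j + t > len(b):
            continue                         # reply lies outside the finite prefix
        if requests > len(received):
            return False                     # no n-th data element in this prefix
        if b[j + t - 1] != received[requests - 1]:
            return False
    return True


D = {"d1", "d2"}
a = ["d1", "d2", "Req", "DU", "Req"]
good = ["DU", "DU", "DU", "DU", "d1", "DU", "d2"]   # replies exactly 2 units later
bad = ["DU", "DU", "DU", "DU", "d2", "DU", "d1"]    # replies in the wrong order

assert exact_rsp(a, good, 2, D)
assert not exact_rsp(a, bad, 2, D)
```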



Correctness of Refinement Step Since the timed hardware queue may output
an arbitrary data element in any time unit $j + T_h$ for which $i.j \neq \mathrm{Req}$, it
follows that HWQb is not a behavioural refinement of HWQ. However, we may
find a downwards specification $D_{HWQ}$ such that

$\mathrm{HWQ} \leadsto^{D_{HWQ}}_{B_{HWQ}} \mathrm{HWQb}$ (7)

where

$B_{HWQ} = \text{Bnd-Rsp}(x, y, T_s + 2 * T_i, \{\mathrm{Req}\}, D) \wedge \text{Bnd-Qum}(i, T_h, W_f)$

$D_{HWQ}$ can be defined as follows.
The correctness of (7) follows straightforwardly by the definition of interface
refinement.



5.2 Interface Component 

The interface component is once more specified in the time-dependent format. 

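INTFb (time_dependent)
in $x : G_{DU}$; $s : D$
out $y : D_{DU}$; $r : G$

asm $\text{Time-Synch}(x) \wedge \text{Req-Ok}(x) \wedge \text{Bnd-Frq}(s, T_i * T_s)$
$\text{Bnd-Qum}(x, T_s + 2 * T_i, W_f - W_h)$

com $\text{Eq}(x, r, G) \wedge \text{Time-Synch}(y) \wedge \text{Eq}(s, y, D) \wedge \text{Bnd-Rsp}(x, r, T_i, G, G)$
$\text{Weak-Rsp}(s, y, T_i) \wedge \text{Bnd-Qum}(r, T_s, W_f - W_h)$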



The two first conjuncts of the assumption and the three first conjuncts of the
commitment restate INTF. The third conjunct of the assumption imposes a
bound on the number of data elements that can be received during one time
unit on s (see Page 16). The auxiliary predicate Bnd-Frq is defined as follows.

Bnd-Frq

$a \in M^{\underline{\infty}}$; $t \in \mathbb{N}$

$\forall j \in \mathbb{N} : (\#(a{\downarrow}_{j+1}) - \#(a{\downarrow}_j)) \le t$
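
On a finite prefix, a frequency bound of this kind says that no single time unit carries more than t messages. The sketch below assumes a list-of-lists encoding of a timed stream (one list of messages per time unit, with the time ticks left implicit); the function name and the sample values chosen for $T_i$ and $T_s$ are assumptions.

```python
from typing import List


def bnd_frq(a: List[List[str]], t: int) -> bool:
    """Finite-prefix check: at most `t` messages in every time unit of `a`."""
    # The messages of time unit j + 1 are exactly those counted in the
    # prefix until time j + 1 but not in the prefix until time j.
    return all(len(unit) <= t for unit in a)


T_i, T_s = 2, 3                       # hypothetical values of the constants
s = [["d1", "d2"], [], ["d3", "d4", "d5", "d6"]]

assert bnd_frq(s, T_i * T_s)          # never more than 6 messages per time unit
assert not bnd_frq(s, 3)              # the third time unit carries 4 messages
```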



Constraints similar to the fourth conjunct of the assumption and the sixth
conjunct of the commitment have already been discussed in connection with HWQb.
The fourth conjunct of the commitment requires that the conversion from x to
r does not lead to a delay of more than $T_i$ time units; the fifth imposes a weaker
timing constraint on the communication in the other direction: if at some point
in time there are exactly e data elements that have been sent on s but not
yet forwarded along y, then the interface component will forward these e data
elements within the next $T_i + (e - 1)$ time units.

If the number of received data elements on the channel s is larger than the
number of data elements sent on y, then at least one data element is forwarded
along y within the next time unit (see explanation on Page 16). This requirement
is formalised by the auxiliary predicate Weak-Rsp as follows.

Weak-Rsp

$a, b \in M^{\underline{\infty}}$; $t \in \mathbb{N}$

$\forall j, e \in \mathbb{N} : \#(D \,\circledS\, a{\downarrow}_j) - \#(D \,\circledS\, b{\downarrow}_j) = e \Rightarrow \#(D \,\circledS\, b{\downarrow}_{j+t+(e-1)}) \ge \#(D \,\circledS\, a{\downarrow}_j)$
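
The weaker response requirement can also be checked on finite prefixes. The following sketch assumes the reading that, whenever e data elements sent on a are still outstanding on b at time j, all of them have appeared on b by time j + t + (e - 1). Timed streams are encoded as lists of lists (one list of messages per time unit); the function names and sample streams are assumptions, and the trivial case e = 0 is skipped.

```python
from typing import List, Set

Timed = List[List[str]]  # element k holds the messages of time unit k + 1


def count_until(stream: Timed, j: int, data: Set[str]) -> int:
    """Number of data elements in `stream` up to time j."""
    return sum(1 for unit in stream[:j] for m in unit if m in data)


def weak_rsp(a: Timed, b: Timed, t: int, data: Set[str]) -> bool:
    """Finite-prefix check of the backlog-dependent forwarding bound."""
    for j in range(len(a) + 1):
        sent = count_until(a, j, data)
        e = sent - count_until(b, j, data)   # elements sent but not yet forwarded
        if e >= 1 and count_until(b, j + t + (e - 1), data) < sent:
            return False
    return True


D = {"d1", "d2", "d3"}
s = [["d1", "d2", "d3"], [], [], []]         # a burst of three in the first unit
y_ok = [[], ["d1"], ["d2"], ["d3"]]          # one element per following unit
y_slow = [[], [], ["d1"], ["d2"], ["d3"]]    # starts one unit too late

assert weak_rsp(s, y_ok, 1, D)
assert not weak_rsp(s, y_slow, 1, D)
```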



Correctness of Refinement Step Since the only difference between INTF
and INTFb is that both the assumption and the commitment have additional
constraints, it follows straightforwardly that

$\mathrm{INTF} \leadsto_{B_{INTF}} \mathrm{INTFb}$ (8)

where

$B_{INTF} = \text{Bnd-Frq}(s, T_i * T_s) \wedge \text{Bnd-Qum}(x, T_s + 2 * T_i, W_f - W_h)$

5.3 Software Queue 

The software queue is specified in the time-dependent format as follows. 

SWQb (time_dependent)
in $r : G$
out $s : D$

asm $\text{Req-Ok}(r) \wedge \text{Bnd-Qum}(r, T_s, W_f - W_h) \wedge \text{Bnd-Frq}(r, T_i)$

com $\text{FIFO-Beh}(r, s) \wedge \text{Bnd-Rsp}(r, s, T_s, \{\mathrm{Req}\}, D) \wedge \text{Bnd-Frq}(s, T_i * T_s)$



SWQb differs from SWQ in that both the assumption and the commitment have
additional conjuncts. The assumption has been strengthened to make sure that
the software queue is never required to store more than $W_f - W_h$ data elements
at the same point in time and never receives more than $T_i$ messages within the
same time unit. The additional conjuncts of the commitment require that the
software queue replies to requests within $T_s$ time units, and never sends more
than $T_i * T_s$ messages along s within the same time unit.
Correctness of Refinement Step Since SWQb differs from SWQ only in
that the assumption and the commitment have additional conjuncts, it follows
trivially that

$\mathrm{SWQ} \leadsto_{B_{SWQ}} \mathrm{SWQb}$ (9)

where

$B_{SWQ} = \text{Bnd-Qum}(r, T_s, W_f - W_h) \wedge \text{Bnd-Frq}(r, T_i)$

5.4 Composite Specification 

The three elementary specifications presented above can be composed into a new 
composite specification as follows. 

FIFO_NETb
in $i : G_{DU}$
out $w : D_{DU}$
loc $x : G_{DU}$; $r : G$; $y : D_{DU}$; $s : D$

$(w, x) := \mathrm{HWQb}(i, y)$   $(y, r) := \mathrm{INTFb}(x, s)$   $(s) := \mathrm{SWQb}(r)$



Correctness of Refinement Step It must be shown that 

FIFO -°®''“'^''bpipoABp:po' FIFO^ETb (10) 

where 

$B_{FIFO'} = \text{Bnd-Qum}(i, T_h, W_f)$

By (6) and the transitivity rule it is enough to show that 

EIEO^ET EIEO_NETb (11) 

(11) follows by the modularity rule (see App. A. 2). 

6 Conclusions 

This paper builds on earlier research, both by us and others: In particular, 
the notion of interface refinement is inspired by [3]; the notion of conditional 
refinement is investigated in [13]; similar concepts have been proposed by others 
(see for example [1]); the one-unit-longer semantics is adapted from [1]; the 
particular form of decomposition rule is discussed in [12]; the specification style 
has been taken from [4]. 
Specification languages based on the assumption/commitment paradigm have
a long tradition. In fact, this style of specification was introduced with Hoare- 
logic [8]. The pre-condition of Hoare-logic can be thought of as an assump- 
tion about the initial state; the post-condition characterises a commitment any 
correct implementation must fulfil whenever the initial state satisfies the pre- 
condition. Well-known methods like VDM [10] and Z [11] developed from Hoare- 
logic. VDM employs the pre/post-style. In Z the pre-condition is stated im- 
plicitly and must be calculated. Together with the more recent B method [2], 
VDM and Z can be seen as leading techniques for the formal development of 
sequential systems. In the case of concurrency and nonterminating systems the 
assumption/commitment style of Hoare-logic is not sufficiently expressive. The 
paradigm presented in this paper is directed towards systems in which interac- 
tion and communication are essential features. Our approach is related to [1]. 
In contrast to [1] we work in the setting of streams and asynchronous message 
passing. 

Conditional refinement is a flexible notion for relating specifications written 
at different levels of abstraction. Conditional refinement supports the introduc- 
tion of boundedness constraints in specifications based on unbounded resources; 
in particular: 

— Replacing purely asynchronous communication by time-synchronous com- 
munication. 

— Replacing unbounded buffers by bounded buffers. 

— Imposing additional boundedness constraints on the size of internal memo- 
ries. 

— Imposing additional boundedness requirements on the timing of input mes- 
sages. 

Conditional refinement is a straightforward extension or variant of well-known 
concepts for refinement [9,10]. Traditional refinement relations typically allow 
the assumption to be weakened and the commitment to be strengthened modulo 
some translation of data structure. Conditional refinement allows both the as- 
sumption and the commitment to be strengthened; however, any strengthening 
of the assumption is recorded in a separate condition. 

There are several alternative, not necessarily equivalent, ways to define
conditional refinement. For instance, the directions of the upwards and downwards
relations could be turned around. In that case we get the following definition of
interface refinement

$S_1 \leadsto^{(U,D)}_B S_2 \equiv S_1 \leadsto_B D \otimes S_2 \otimes U$

where B is a constraint on the external interface of $S_1$. Alternatively, by allowing
B to refer to the external interfaces of both $S_1$ and $S_2$ we could formulate the
upwards and downwards relations within B. Which alternative is best suited
from a pragmatic point of view is debatable.

According to [5], hardware/software co-design is the simultaneous design of 
both hardware and software to implement a desired function or specification. 
The presented approach should be well-suited to support this kind of design: 
— It allows the integration of purely asynchronous, hand-shake (see [13]) and
time-synchronous communication in the same composite specification. 

— It allows specifications based on asynchronous communication to be re- 
fined into specifications based on time-synchronous communication, and 
vice-versa. 
A Three Rules

In this paper we use several rules. Some of these rules are very simple and there- 
fore not stated explicitly. For example, to prove (1) of Sect. 4.5 we need a rule 
capturing the definition of behavioural refinement. Three less trivial rules are 
formulated below; in App. B we prove their soundness. Any free variable is uni- 
versally quantified over infinite timed streams of messages typed in accordance 
with the corresponding channel declaration. 

A.l Transitivity Rule 




— Premises 1 and 2 make sure that the overall condition B is stronger than
$B_1$ and $B_2$.

— Premise 3 makes sure that the upwards relation obtained by connecting $U_2$
and $U_1$ is allowed by U.

— Premise 4 makes sure that the downwards relation obtained by connecting
$D_1$ and $D_2$ is allowed by D.

— Premises 5 and 6 are illustrated by Fig. 3.
A.2 Modularity Rule



3 ■■ UijAlDij) 

C7, ])=^[ C7] 

{AUl Di 

BA{AUlSll)^AUBi 



\ <»uSi ^ -""b si 

$\exists! V : P$ holds if there is a unique V such that P holds. Some intuition:

— Premise 1 makes sure that for each concrete input history such that $B_i$
holds, the upwards relation $U_i$ allows exactly one abstract input history.
Hence, each concrete input history that satisfies the condition is related to
exactly one abstract input history, but the same abstract input history can
be related to several concrete input histories.

— Premise 2 makes sure that the conjunction of the downwards and upwards
relations is consistent.

— Premise 3 makes sure that the upwards relation described by the conjunction
of the n upwards relations $U_i$ is allowed by the overall upwards relation U.

— Premise 4 makes sure that the downwards relation described by the conjunction
of the n downwards relations $D_i$ is allowed by the overall downwards relation
D.

— Premise 5 makes sure that the assumptions made by the n conditions $B_i$ are fulfilled
by the composite specification consisting of the n specifications $S_i'$ when the
input from the overall environment satisfies B.



A.3 Decomposition Rule



A A A2-Io 



Vj G hi . .4 A A (^2^-12 ' A . 424 .J- 1-1 

A A (Cl) A (C2) => (Ai A (Cl ^ ^ 2 )) V (A2 A (C2 => Ai)) 

Vj G hloo • A f-U + 1 A C 2 j ^/2 :J^02 0“t“ 1 C\-J-j\-0-j-l-l 



S Si ® S 2 
$\langle P \rangle$ denotes the upwards closure of P; formally $\langle P \rangle \equiv \forall j \in \mathbb{N} : P{\downarrow}_j$. S is a
specification with assumption A, commitment C, and external interface (I, O).
As illustrated by Fig. 4, the relationship between $S_1$/$S_2$ and

$A_1, C_1, (I_1, O_1)$ / $A_2, C_2, (I_2, O_2)$

is defined accordingly. As explained in detail in App. B.3, the correctness of the
decomposition rule follows by induction. Some intuition:

— Premise 1 makes sure that the component assumptions $A_1$ and $A_2$ hold at
time 0.

— Premise 2 makes sure that the component assumptions $A_1$ and $A_2$ hold at
time j + 1 if the component commitments $C_1$ and $C_2$ hold at time j + 1.

— Premise 3 is concerned with liveness in the assumptions. This premise is not
required if both $A_1$ and $A_2$ are upwards closed.

— Premise 4 makes sure that the overall commitment C holds at time j + 1 if
the component commitments $C_1$ and $C_2$ hold at time j + 1.




Fig. 4. Network Represented by Si ® S 2 



B Soundness of Rules 



In this appendix we prove that the three rules formulated in App. A are sound. 
Any free variable is universally quantified over infinite timed streams of messages 
typed in accordance with the corresponding channel declaration. 
B.1 Transitivity Rule

Given

$B \wedge [\![ U_2 ]\!] \wedge [\![ D_2 ]\!] \Rightarrow B_1$ (12)

$B \Rightarrow B_2$ (13)

$[\![ U_1 ]\!] \wedge [\![ U_2 ]\!] \Rightarrow [\![ U ]\!]$ (14)

$[\![ D_1 ]\!] \wedge [\![ D_2 ]\!] \Rightarrow [\![ D ]\!]$ (15)

$S_1 \leadsto^{(U_1,D_1)}_{B_1} S_2$ (16)

$S_2 \leadsto^{(U_2,D_2)}_{B_2} S_3$ (17)

It must be shown that

$S_1 \leadsto^{(U,D)}_{B} S_3$ (18)

(18) is equivalent to

$B \wedge [\![ S_3 ]\!] \Rightarrow [\![ U \otimes S_1 \otimes D ]\!]$ (19)

To prove (19), assume there are $I_{S_3}, O_{S_3}$ such that

$B$ (20)

$[\![ S_3 ]\!]$ (21)

(13), (20) imply

$B_2$ (22)

(17), (21), (22) imply

$[\![ U_2 \otimes S_2 \otimes D_2 ]\!]$ (23)

(23) implies there are $O_{U_2}, I_{D_2}$ such that

$[\![ U_2 ]\!]$ (24)

$[\![ S_2 ]\!]$ (25)

$[\![ D_2 ]\!]$ (26)

(12), (20), (24), (26) imply

$B_1$ (27)

(16), (25), (27) imply

$[\![ U_1 \otimes S_1 \otimes D_1 ]\!]$ (28)

(28) implies there are $O_{U_1}, I_{D_1}$ such that

$[\![ U_1 ]\!]$ (29)

$[\![ S_1 ]\!]$ (30)

$[\![ D_1 ]\!]$ (31)

(14), (24), (29) imply

$[\![ U ]\!]$ (32)

(15), (26), (31) imply

$[\![ D ]\!]$ (33)

(30), (32), (33) imply

$[\![ U \otimes S_1 \otimes D ]\!]$ (34)

The way (34) was deduced from (20), (21) implies (19).



B.2 Modularity Rule 

Given 

AUiBi^lOu.-lUi]) (35) 

3 ■■ A”.i([ C/i 1 A [ A 1) (36) 

(A^LiI Al)^[ G] (37) 

(A^Li[A])->[7?] (38) 

BA(a:LiI5'])^a:LiA (39) 

S') (40) 

It must be shown that 

®?.i 5' (41) 

(41) follows if we can show that 

BA[®;Li 5'1^I f7®(®r=i5,)®fl] (42) 

To prove (42), assume there are 70 »_ g>,L 0 n_ g' , g' such that 
B (43) 

Ar.i[5'l (44) 

(39), (43), (44) imply 






(45)

(40), (44), (45) imply

AUl Ui ^ Si ^Di I (46) 

(46) implies 

A^^,3 Ou„Id,-- I UijAl Si jAlDij (47) 

(35), (36), (47) imply there are such that 

r^UlUil (48) 

] (49) 

/\U\Di\ (50) 

(37), (38), (48), (50) imply 

I U ] (51) 

I D 1 (52) 

(49), (51), (52) imply 

[ C7® (^"^iS'i) 1 (53) 

The way (53) was deduced from (43), (44) implies (42). 



B.3 Decomposition Rule 

Given 



■A. => ^i4o A A 2 I 0 (54) 

\/ j € N : ^ A A C2i-i2-.ji-02-j+i ^i4-j+i A ^24-i+i (55) 

A A (Cl) A {C 2 ) => (^1 A (Gi ^ 7 I 2 )) V ( 7 I 2 A (C 2 => Ai)) (56) 

Vj G Hoc • -^42 A Gr 4/i :j40i + 1 A G24/2 :j4o 2 0“t“ 1 G4/:j40:j + l (5*7) 

It must be shown that 

S S\ ® S 2 (58) 

(58) is equivalent to 

[ ® 52 ] ^ [ 5 ] (59) 

(59) is equivalent to 

(V J G Nqo • ^l4; ^ Gl4/i :j40i :j + l ) A 
(V J G Nqo • A2~l-j ^ G24/2:j4o2:i+l) 

^ (60) 

(V j G Nqo • Alj ^ C\.j-,j\-o-.j + l)

(60) is equivalent to
Vi G Noo : 

A\.j A 

(Vi G Noo • ^l4-i ^ ^ 

(Vi G Noo • ^24-i ^ C 2 \-l 2 :j\- 02 -j+l') 

=> (61) 

To prove (61), assume there are I, O, I such that 

Ail (62) 

Vi G Noo • ^l4-i ^ ^lili:jiOi:j+l (^^) 

Vi G Noo : A 2 ij => C 2 ii 2 -.jio 2 -.j+i (64) 

There are two cases to consider. Assume 

I < 00 (65) 

(54), (55), (62), (63), (64) and induction on j imply 
j < I => Aiij A A2ij (66) 

(63), (64), (66) imply 

C'lJ'/i:/J'Oi:/+l A C2il2-.lio2-l+l (67) 

(57), (62), (67) imply 

Cii-.iio-.i+i (68) 

The way (68) was deduced from (62), (63), (64) proves (61) for the case that 
(65) holds. Assume 

I = 00 (69) 

By the same inductive argument as above 

{Ci)A{C2) (70) 

(56) , (62), (69), (70) imply 

(Ai A(f7i ^ A2))V(A2A(f?2 ^ Ai)) (71) 

(63), (64), (71) imply 

Cl A C 2 (72) 

(57) , (62), (69), (72) imply 

C (73) 

The way (73) was deduced from (62), (63), (64) proves (61) for the case that
(69) holds.

Abstract. The goal of deductive design is the systematic construction
of a system implementation starting from its behavioural specification 
according to formal, provably correct rules. We use Haskell to formulate 
a functional model of directional, synchronous and deterministic systems 
with discrete time. The associated algebraic laws are then employed in 
deductive hardware design of basic combinational and sequential circuits 
as well as a brief account of pipelining. With this we tackle several of 
the IFIP WG 10.5 benchmark verification problems. Special emphasis is 
laid on parameterisation and re-usability aspects. 



1 Introduction 

1.1 Deductive Design 

The goal of deductive design is the systematic construction of a system im- 
plementation, starting from its behavioural specification, according to formal, 
provably correct rules. The main advantages are the following. 

First, the resulting implementation is correct by construction. Moreover, im- 
plementations can be constructed in a modular way: in the first stage, the main 
emphasis lies on correctness, while in further stages transformations can be used 
to increase efficiency. 

Second, the rules can be formulated schematically, independent of the partic- 
ular application area; hence they are re-usable for wide classes of similar prob- 
lems. 

Third, being formal, the design process can be assisted by machine. This
helps to avoid clerical errors and disallows “cheating” during the derivation.
Moreover, a formal derivation also serves as a record of the design decisions 
that went into the construction of the implementation. It is an explanatory 
documentation and eases revision of the implementation upon modification of 
the system specification. 

Note that we do not view deductive design as an alternative to verification, but
as complementary to it. In fact, transformational derivations will usually be
interleaved with verification of lemmas needed on the way. Conversely, verification
may benefit from incorporation of standard reduction strategies based on
algebraic laws. Some authors even subsume verification approaches under deductive
design (see e.g. [15]).

There is a variety of approaches to deductive design in our narrower sense,
e.g. refinement calculus, program extraction from proofs and transformational
programming. We shall follow the last of these (see e.g. [7, 33]) and use mainly
equational reasoning, algebraic laws, structural induction and fixpoint induction
for recursive definitions.

We exemplify deductive hardware design in the particular area of directional, 
synchronous and deterministic systems with discrete time. Directionality means 
that input and output are clearly distinguished. As for synchronicity we assume 
that our systems are clocked, in particular, that the clock period is long enough 
that all submodules stabilise their output within it. 

The approach generalises with varying degrees of complexity to adirectional
systems, asynchrony, non-determinacy or continuous time. Adirectionality, as
found e.g. in buses, may be modeled better in a relational than in a functional
setting; handshake communication may be treated more adequately by formal
systems based on CCS, CSP or ACP as presented e.g. in [2, 23, 22].

We show deductive design of basic combinational and sequential circuits, 
notably systolic ones, and give a brief transformational account of pipelining. 
Special emphasis is laid on parameterisation and re-usability aspects. 

1.2 The Framework 

We model hardware functionally in Haskell. The reasons for this are the follow- 
ing. 

First, functional languages support various views of streams directly, e.g.
through lazy lists or functions from time to data as first-class objects.

Second, polymorphism allows generic formulations and hence supports re-use. 

Third, since all specifications are executable, direct prototyping is possible. 

Fourth, functional languages are being considered for their suitability as bases
of modern hardware description languages. Examples are, in historical order,
Hydra [32], MHDL [34] (unfortunately abandoned), Lava [3, 36], HAWK [9,
26, 18] and SLDL [37] (which is still in the requirements definition phase). Many
other approaches to hardware specification and verification also use higher-order
concepts to good advantage (see e.g. [16]).

Finally, a transformation system ULTRA for the Gofer sublanguage of
Haskell is being constructed at the University of Ulm [35]. It is an adaptation of
the system CIP-S [6]. A prototype version of ULTRA has been used to formally
check most of the laws and derivations in our paper (which originally were done 
with paper and pencil) by machine. While the overall derivations were found to 
be correct, a number of minor errors and missing side conditions were discovered. 
The set of transformation rules obtained in this way can be re-used for further 
derivations that now, of course, should take place directly on the system. 

Of course, transformational derivation of circuits is not a new idea (see e.g.
[19, 21, 11, 12]). What we view as novel is the exploitation of the full power
of Haskell for specifying circuits at a purely “mathematical” level that is far
removed from that of concrete hardware. This leads to very simple and concise
specifications. An approach that is close to this spirit is presented in [5], using,
however, a purely mathematical language that is not directly executable.

At the same time, by taking a high-level algebraic approach we are able to
keep the derivations from that level to the actual implementation level
surprisingly short. This is partially due to the fact that the problems of wiring
enter into our derivations only at a very late stage and hence do not clutter the
essential steps.

By exploiting the polymorphism of Haskell we can use part of the algebra 
for a uniform treatment of combinational and sequential circuits. An essential 
subalgebra is network algebra as investigated e.g. in [38].

The introduction of small but very effective sets of algebraic laws for special
subproblem areas, such as the treatment of delay and slowdown, makes the
approach particularly streamlined. Some of these laws correspond to manipulations
of layout graphs; whenever appropriate, we therefore perform the derivations at
the graphical level to make them easier to grasp.

The approach has also been used quite successfully in teaching the essentials
of hardware to first-year students (who, of course, had been exposed to the Gofer
sublanguage of Haskell in the beginning of the year).

A brief review of the essential concepts of Haskell and a notational extension 
that are used in this paper can be found in the Appendix. 



Part I: Combinational Circuits 

2 A Model of Combinational Circuits 

We start with the simple case of combinational circuits. While this avoids some 
of the complexity of dealing with streams, it already allows us to introduce a 
substantial part of the underlying algebra, which will be re-used for the stream 
case. Moreover, in an extended case study, we introduce the main ingredients 
of the transformational approach, viz. the unfold/fold strategy, generalisation, 
parameterisation, abstraction and re-use of designs. All this will reappear in the 
stream-based treatment of sequential circuits. 



2.1 Functions as Modules 

A combinational module will be modeled as a function taking a list of inputs
to a list of outputs. This function reflects the behaviour at one clock tick. This
mirrors the underlying assumption of synchrony: the complete list of inputs must
be available before the output can be computed. Strictly speaking, this conflicts
with the lazy semantics of Haskell; however, we shall employ such functions
only in contexts where the difference between eager and lazy semantics does not
matter.

Using lists of inputs and outputs has the advantage that the basic connection 
operators can be defined independent of the arities of the functions involved. The 
disadvantage is that we need uniform typing for all inputs/outputs, since Haskell 
does not allow heterogeneous lists. 

So we assume some basic type a that is the direct sum of all data types
involved, such as integers, Booleans, bytes etc. All our definitions will be
polymorphic in that type a. Then a function f describing a module with m inputs
and n outputs will have the type f :: Module with

    type Module = [a] -> [a]

However, f will be defined only for input lists of length m and always produce
output lists of length n. Assuming

    [o1, ..., on] = f [i1, ..., im]

we represent such a module diagrammatically as

[Diagram not reproduced: a box f with input wires i1, ..., im and output wires o1, ..., on.]

We now discuss briefly the role of functions as modules of a system. In a 
higher-order language such as Haskell there are two views of functions: 

— as routines with a body expression that depends on the formal parameters, 
like in first-order languages; 

— as “black boxes” which can be freely manipulated by higher-order functions 
(combinators) . 

The latter view is particularly adequate for functional hardware descriptions, 
since it allows the direct definition of various composition operations for hard- 
ware modules. 

However, contrary to other approaches we do not reason purely at the com- 
binator level, i.e. without referring to individual in/output values. While this 
has advantages in some situations, it can become quite tedious in others. So we 
prefer to have the possibility to switch. 

The basis for showing equality of two expressions that yield functions as their 
values is the extensionality rule 

    f = g  iff  f x = g x for all x.

Many algebraic laws we use are equalities between functions, interpreted as
extensional equalities.

Example 2.1. Function composition is defined in Haskell by

    (f . g) x = f (g x)

with polymorphic combinator

    (.) :: (b -> c) -> (a -> b) -> a -> c

A fundamental law is associativity of composition:

    (f . g) . h = f . (g . h)

□



2.2 Modelling Connections 

We shall employ two views of connections between modules: 

— that of “rubber wires”, represented by formal parameters or implicitly by
plugging in subexpressions as operands;

— that of “rigid wires”, represented by special routing functions which are 
inserted using basic composition combinators. 

Contrary to other approaches (e.g. [19,21]) we proceed in two stages: 

— We start at the level of rubber wiring to get a first correct implementation. 

— Then we (mechanically) get rid of formal parameters by combinator abstrac- 
tion to obtain a version with rigid wiring. 

This avoids introducing wiring combinators at too early a stage and carrying 
them through all the derivation in an often tedious manner. 

In drawing diagrams we shall be liberal and use views in between rubber 
and rigid wiring. In particular, we shall use various directions for the input and 
output arrows. So input arrows may not only enter at the top but also from the 
right or from the left; an analogous remark holds for the output arrows. 

Example 2.2 (Splicing). An operation we use in several examples below is
that of splicing two modules together along one wire. This means that one output
wire of the second module serves as an input to the first module. This can e.g.
be used for passing carries from one module to the next. It is modeled by

    splice :: Int -> Module -> Module -> Module

defined as

    splice m f g (xs ++ [c]) = f (take m xs ++ [u]) ++ us
      where (u:us) = g (drop m xs ++ [c])

Assume now 



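    xs = [x1, ..., xm, x(m+1), ..., xn]
    [u1, u2, ..., up] = g [x(m+1), ..., xn, c]
    [v1, v2, ..., vk] = f [x1, ..., xm, u1]

Then we may depict splice m f g (xs ++ [c]) as

[Diagram not reproduced: g receives the inputs x(m+1), ..., xn and c; its first
output u1 is fed, together with x1, ..., xm, into f.]

We straighten and move some wires to obtain the following form:

[Diagram not reproduced: f and g side by side, with the inputs x1, ..., xm and
x(m+1), ..., xn, c entering and the outputs v1, ..., vk and u2, ..., up leaving.]
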

□ 
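The left-hand side xs ++ [c] in the definition of splice is specification notation rather than a pattern that a Haskell compiler accepts. A minimal executable sketch, assuming a synonym Module a with an explicit type parameter and using init and last to split off the final input (both are choices made here, not taken from the text), might read as follows; the one-digit adder fa only serves as a sample module.

```haskell
-- Sketch only: 'Module a' with an explicit type parameter is an assumption.
type Module a = [a] -> [a]

-- splice m f g: g consumes the inputs after position m plus the last input c;
-- its first output u becomes the last input of f.
splice :: Int -> Module a -> Module a -> Module a
splice m f g inputs =
  case g (drop m xs ++ [c]) of
    (u:us) -> f (take m xs ++ [u]) ++ us
    []     -> error "splice: second module produced no output"
  where
    xs = init inputs   -- all inputs except the last one
    c  = last inputs   -- the last input, e.g. an incoming carry

-- Sample module: one-digit adder in base p, returning [carry, sum digit].
fa :: Int -> Module Int
fa p [x, y, c] = [s `div` p, s `mod` p] where s = x + y + c
fa _ _         = error "fa: expects exactly three inputs"

example :: [Int]
example = splice 2 (fa 10) (fa 10) [3, 4, 8, 9, 0]
```

Here example evaluates to [0,8,7]: the right module adds 8 + 9 + 0 and hands its carry 1 to the left module, which adds 3 + 4 + 1.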



Lemma 2.3. Splicing is associative in the following sense:

    (f `splice m` g) `splice (m+k)` h =
    f `splice m` (g `splice k` h)

Moreover, the identity id on singleton lists is left and right neutral w.r.t. splice.

Associativity is essential in that it shows that the functional model ad- 
equately describes the graphical and layout views of hardware: there are no 
“parentheses” in circuits, and hence the mathematical model should not depend 
on parenthesisation. 
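Lemma 2.3 can be spot-checked on concrete modules. The sketch below repeats an executable splice (with the assumed synonym Module a and a hypothetical one-digit adder fa as sample module) and builds both bracketings for m = 2 and k = 2. Since plain Haskell does not accept backquoted partial applications such as `splice m`, prefix application is used instead. A check of this kind is of course no substitute for the proof.

```haskell
type Module a = [a] -> [a]   -- assumption: explicit type parameter

splice :: Int -> Module a -> Module a -> Module a
splice m f g inputs =
  case g (drop m xs ++ [c]) of
    (u:us) -> f (take m xs ++ [u]) ++ us
    []     -> error "splice: second module produced no output"
  where
    xs = init inputs
    c  = last inputs

-- Sample module (hypothetical choice): one-digit adder in base p.
fa :: Int -> Module Int
fa p [x, y, c] = [s `div` p, s `mod` p] where s = x + y + c
fa _ _         = error "fa: expects exactly three inputs"

cell :: Module Int
cell = fa 10

-- Both sides of the associativity law with m = 2 and k = 2.
lhs, rhs :: Module Int
lhs = splice (2 + 2) (splice 2 cell cell) cell
rhs = splice 2 cell (splice 2 cell cell)

-- The identity on singleton lists as left and right neutral element.
leftNeutral, rightNeutral :: Module Int
leftNeutral  = splice 0 id cell
rightNeutral = splice 2 cell id
```

For the input [1,2,3,4,5,6,1] both lhs and rhs yield [0,3,8,2], which is the decimal sum 135 + 246 + 1.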



2.3 Wire Bundles 



Often we need to deal with wire bundles. In the case of circuits for binary
arithmetic operators it is usually assumed that the wires for the single bits of
the two operands are interleaved (or shuffled) in the following fashion:

[Diagram not reproduced: the wires of the two operands x and y alternate.]



So the bits for one operand occur at even positions in the overall list of inputs, 
those for the other one at odd positions. To extract the corresponding sublists 
we use 

    evns xs = [ xs !! i | i <- [0 .. length xs - 1], even i ]
    odds xs = [ xs !! i | i <- [0 .. length xs - 1], odd i ]

Recursive versions of these functions are

    evns []     = []
    evns (x:xs) = x : odds xs
    odds []     = []
    odds (x:xs) = evns xs

The converse is shuf k, which shuffles two lists of length k, say ys and zs,
into one list of length 2*k.

[Diagram not reproduced: the two lists are merged so that their elements alternate.]




Following our general principle that every module takes one list of inputs,
we have to concatenate ys and zs into one list xs. Then shuf is specified by

    (shuf n xs) !! (2*i)   = xs !! i
    (shuf n xs) !! (2*i+1) = xs !! (n+i)

for length xs == 2*n and i <- [0..n-1]. This is an implicit specification;
the patterns on the left hand side are not legal Haskell patterns. However, the
clauses of this specification will be used as algebraic laws in derivations. An
explicit version is

    shuf n xs
      | length xs == 2*n = ileave (take n xs) (drop n xs)

    ileave [] [] = []
    ileave (y:ys) (z:zs) = y : (z : ileave ys zs)
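The definitions of this subsection can be collected into a small executable sketch. The names evnsIx and oddsIx for the index-based versions, the error message for inputs of the wrong length and the sample list are assumptions made here for illustration.

```haskell
-- Recursive versions from the text.
evns, odds :: [a] -> [a]
evns []     = []
evns (x:xs) = x : odds xs
odds []     = []
odds (_:xs) = evns xs

-- Index-based versions from the text, renamed for comparison.
evnsIx, oddsIx :: [a] -> [a]
evnsIx xs = [ xs !! i | i <- [0 .. length xs - 1], even i ]
oddsIx xs = [ xs !! i | i <- [0 .. length xs - 1], odd i ]

-- Interleave two lists; stops as soon as one of them is exhausted.
ileave :: [a] -> [a] -> [a]
ileave (y:ys) (z:zs) = y : z : ileave ys zs
ileave _      _      = []

-- Assumption: inputs of the wrong length are rejected with an error.
shuf :: Int -> [a] -> [a]
shuf n xs
  | length xs == 2 * n = ileave (take n xs) (drop n xs)
  | otherwise          = error "shuf: input must have length 2*n"

sample :: [Int]
sample = shuf 3 [1, 2, 3, 10, 20, 30]
```

With these definitions sample is [1,10,2,20,3,30], and evns and odds recover [1,2,3] and [10,20,30] from it, so shuffling followed by extraction returns the two operands.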

3 Numbers and Their Representation 

We now briefly leave the field of circuits. As a preparation for the derivation 
of some basic arithmetic circuits we need some definitions concerning the rep- 
resentation of natural numbers w.r.t. a base p. To simplify matters we use the 
nonnegative part of the Haskell type Int for treating natural numbers; a different
possibility would be the definition of a recursive data type

    data Nat = Zero | Succ Nat

We avoid this, since it would necessitate a lengthy redefinition of all arithmetic 
operators. 

To characterise p-adic digits we use the auxiliary predicate

    below :: Int -> Int -> Bool
    n `below` m = 0 <= n && n < m

Then d is a p-adic digit iff d `below` p. Lists of length k consisting only of p-adic
digits are characterised by

    digits :: Int -> Int -> [Int] -> Bool
    digits p k xs = length xs == k && all (`below` p) xs

Now we define representation and abstraction functions between (the nonnega- 
tive part of) Int and lists of p-adic digits. To cope with bounded word length, 
we parameterise them not only with p but also with the number of digits to be 
considered. 

First we define the representation function

    code :: Int -> Int -> Int -> [Int]

Its first argument p is the base of the number system; for n > 0 the result of
code p k n is defined only for p > 1. The second argument k is the number of
digits we want to consider. An exact representation with k digits is only possible
for numbers n with n `below` p^k. Hence for other numbers n the result of
code p k n is undefined. Otherwise it is the p-adic representation of n with k
digits of precision (padded with leading zeros if necessary). The definition reads

    code p 0 0 = []
    code p (k+1) n = code p k (n `div` p) ++ [n `mod` p]

Example 3.1.

    code 2 5 24 = [1, 1, 0, 0, 0]
    code 2 7 24 = [0, 0, 1, 1, 0, 0, 0]



□ 
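The clause code p (k+1) n uses an n+k pattern, which Gofer and Haskell 98 accepted but Haskell 2010 removed. A hedged sketch of the same recursion with guards, assuming that the undefined cases should be reported through error, could look like this:

```haskell
-- code p k n: the k-digit representation of n in base p, most significant
-- digit first. The error branch is an assumption standing in for "undefined".
code :: Int -> Int -> Int -> [Int]
code p k n
  | k == 0 && n == 0 = []
  | k > 0 && p > 0   = code p (k - 1) (n `div` p) ++ [n `mod` p]
  | otherwise        = error "code: n is not representable with k digits in base p"
```

This reproduces Example 3.1: code 2 5 24 gives [1,1,0,0,0] and code 2 7 24 gives [0,0,1,1,0,0,0]. A number that does not fit, such as code 2 3 24, leaves a nonzero remainder when k reaches 0 and therefore ends in the error branch.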



For the corresponding abstraction function

    deco :: Int -> Int -> [Int] -> Int

the result of deco p k xs is the number represented by the list xs of p-adic
digits. Again k is the number of digits expected; the result is undefined if xs has
a length different from k. The definition reads

    deco p 0 [] = 0
    deco p (k+1) xs = (deco p k (init xs)) * p + last xs

These particular abstraction and representation functions have been
introduced in [13]. They are useful in that they admit induction/recursion over the
parameter k. They enjoy pleasant algebraic properties:

Lemma 3.2. The functions code and deco are inverses of each other:

    deco p k (code p k n) = n    <==  n `below` p^k
    code p k (deco p k xs) = xs  <==  digits p k xs

Moreover, we have the decomposition/distributivity properties

    code p (j+k) (m * p^k + n) = code p j m ++ code p k n
        <==  m `below` p^j && n `below` p^k
    deco p (j+k) (xs ++ ys) = (deco p j xs) * p^k + deco p k ys
        <==  digits p j xs && digits p k ys

By the sign <== we mean logical implication from right to left; it is not part of
official Haskell.
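The properties of Lemma 3.2 can be tested exhaustively for small parameters. The sketch below restates code and deco with guards instead of n+k patterns; the error branches and the names inverseOk and splitOk are assumptions introduced here, and the checks only sample the lemma rather than prove it.

```haskell
code :: Int -> Int -> Int -> [Int]
code p k n
  | k == 0 && n == 0 = []
  | k > 0 && p > 0   = code p (k - 1) (n `div` p) ++ [n `mod` p]
  | otherwise        = error "code: n is not representable with k digits"

deco :: Int -> Int -> [Int] -> Int
deco p k xs
  | k == 0 && null xs       = 0
  | k > 0 && length xs == k = deco p (k - 1) (init xs) * p + last xs
  | otherwise               = error "deco: list length differs from k"

-- deco p k (code p k n) == n for every n below p^k.
inverseOk :: Int -> Int -> Bool
inverseOk p k = and [ deco p k (code p k n) == n | n <- [0 .. p ^ k - 1] ]

-- Both decomposition properties for all m below p^j and n below p^k.
splitOk :: Int -> Int -> Int -> Bool
splitOk p j k = and
  [ code p (j + k) (m * p ^ k + n) == xs ++ ys
    && deco p (j + k) (xs ++ ys) == deco p j xs * p ^ k + deco p k ys
  | m <- [0 .. p ^ j - 1], n <- [0 .. p ^ k - 1]
  , let xs = code p j m, let ys = code p k n ]
```

For instance, inverseOk 2 5 runs over all 32 five-bit numbers and splitOk 3 2 2 over all 81 pairs of two-digit numbers in base 3.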

These properties are verified by structural induction over the lists involved
using the following properties of div and mod (see [13]):

Lemma 3.3. Assume that p > 0.

    x = (x `div` p) * p + x `mod` p
    (x+y) `mod` z = y `mod` z                <==  (x `mod` z) = 0
    (x+y) `div` z = x `div` z + y `div` z    <==  (x `mod` z) = 0
    (x `div` p^m) `div` p^n = x `div` p^(m+n)
    (x `mod` p^m) `mod` p^n = x `mod` p^(min m n)
    (x `mod` p^m) `div` p^n = (x `div` p^n) `mod` p^(max 0 (m-n))
    (x `div` p^m) `mod` p^n = (x `mod` p^(m+n)) `div` p^m
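As a sanity check, the laws of Lemma 3.3 can be evaluated over a range of small nonnegative numbers. The predicate names powerLaws and sumLaws are hypothetical, and the two laws with the premise x `mod` z = 0 are encoded as implications; the sketch samples the laws and proves nothing.

```haskell
-- The first law and the four laws about powers of the base p.
powerLaws :: Int -> Int -> Int -> Int -> Bool
powerLaws p m n x = and
  [ x == (x `div` p) * p + x `mod` p
  , (x `div` p ^ m) `div` p ^ n == x `div` p ^ (m + n)
  , (x `mod` p ^ m) `mod` p ^ n == x `mod` p ^ min m n
  , (x `mod` p ^ m) `div` p ^ n == (x `div` p ^ n) `mod` p ^ max 0 (m - n)
  , (x `div` p ^ m) `mod` p ^ n == (x `mod` p ^ (m + n)) `div` p ^ m
  ]

-- The two laws with premise x `mod` z == 0, written as implications.
sumLaws :: Int -> Int -> Int -> Bool
sumLaws x y z =
  x `mod` z /= 0
    || ( (x + y) `mod` z == y `mod` z
         && (x + y) `div` z == x `div` z + y `div` z )

allLawsHold :: Bool
allLawsHold =
  and [ powerLaws p m n x | p <- [2, 3], m <- [0 .. 3], n <- [0 .. 3], x <- [0 .. 100] ]
    && and [ sumLaws x y z | z <- [1 .. 5], x <- [0 .. 20], y <- [0 .. 20] ]
```

The case m < n of the sixth law is worth noting: the left-hand side is 0 because x `mod` p^m is smaller than p^n, and the right-hand side is 0 because every number is 0 modulo p^0 = 1.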

4 Development of an Adder 

As our first case study we derive several adders. This tackles one of the IFIP 
verification benchmarks [20]. Moreover, in this example we demonstrate the basic 
techniques and strategies of deductive design by transformation. 

We specify a generic adder function

    add :: Int -> Int -> [Int] -> [Int]

The first parameter is the base for the number representation, the second the
number of digits we treat. For the specification we assume that the list zs is the
shuffle of the digit lists for the two summands, i.e. that digits p (2*k) zs holds.
Then we specify

    add p k zs = code p (k+1) (deco p k (evns zs) +
                               deco p k (odds zs))

The length k+1 for the result list serves to accommodate a possible overflow
digit. We illustrate this by a diagram, using temporarily Dig as the subtype of
Int comprising the p-adic digits:

[Diagram not reproduced: add maps [Dig] x [Dig] to [Dig] in correspondence with + on the represented numbers.]


This specification does not yet provide a particular adding algorithm which 
could be directly taken as the description of an abstract layout. It clearly sep- 
arates “what” from “how” and allows quite different implementations, such as 
carry-ripple and carry-lookahead adders. This will be exploited in later stages of 
our derivation. 
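Because the specification is executable, it can be run directly as a prototype before any adding algorithm has been derived. The following sketch assumes guard-based versions of code and deco (with error standing in for the undefined cases) and the recursive evns and odds.

```haskell
evns, odds :: [a] -> [a]
evns []     = []
evns (x:xs) = x : odds xs
odds []     = []
odds (_:xs) = evns xs

code :: Int -> Int -> Int -> [Int]
code p k n
  | k == 0 && n == 0 = []
  | k > 0 && p > 0   = code p (k - 1) (n `div` p) ++ [n `mod` p]
  | otherwise        = error "code: n is not representable with k digits"

deco :: Int -> Int -> [Int] -> Int
deco p k xs
  | k == 0 && null xs       = 0
  | k > 0 && length xs == k = deco p (k - 1) (init xs) * p + last xs
  | otherwise               = error "deco: list length differs from k"

-- The specification of the adder, taken over unchanged from the text:
-- decode both operands, add the numbers, encode the sum with k+1 digits.
add :: Int -> Int -> [Int] -> [Int]
add p k zs = code p (k + 1) (deco p k (evns zs) + deco p k (odds zs))
```

For example, add 10 2 [3,4,8,9] decodes the operands 38 and 49 and returns [0,8,7], and add 2 3 [1,1,1,1,1,1] computes 7 + 7 and returns [1,1,1,0].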



4.1 The Unfold/Fold Strategy 

Our first goal is now to derive an inductive (recursive) version of add that
no longer refers to deco and code and uses only operations on single digits.

The derivation is driven by the recursion structure of the abstraction and
representation functions. It follows a general strategy that can be partly
automated, e.g. by transformation tactics in ULTRA, and, in the present case, does
not require great amounts of intuition. This classical unfold/fold strategy (see
e.g. [33]) consists of the following steps:

— Unfold the definitions of deco and code. 

— Simplify and rearrange. 

— Fold with the definition of add to get recursive calls. 

The derivation follows the case analysis of deco and code. 

For k=0 we calculate

    add p 0 []
  = {[ unfold add ]}
    code p 1 (deco p 0 [] + deco p 0 [])
  = {[ unfold deco, neutrality of 0 ]}
    code p 1 0
  = {[ unfold code ]}
    code p 0 (0 `div` p) ++ [0 `mod` p]
  = {[ arithmetic and unfold code ]}
    [] ++ [0]
  = {[ neutrality of [] ]}
    [0]

This is the termination case; here the overflow digit is 0.

For k > 0 we calculate, assuming xs = evns zs and ys = odds zs,

    add p (k+1) (zs ++ [x,y])
  = {[ unfold add ]}
    code p (k+2) (deco p (k+1) (xs ++ [x]) +
                  deco p (k+1) (ys ++ [y]))
  = {[ unfold deco ]}
    code p (k+2) ((deco p k xs)*p + x + (deco p k ys)*p + y)
  = {[ arithmetic ]}
    code p (k+2) ((deco p k xs + deco p k ys)*p + x + y)
  = {[ unfold code ]}
    code p (k+1) (z `div` p) ++ [z `mod` p]
      where z = (deco p k xs + deco p k ys)*p + x + y
  = {[ by Lemma 3.3 ]}
    code p (k+1) (deco p k xs + deco p k ys + (x + y) `div` p)
      ++ [(x + y) `mod` p]

This expression is almost foldable. However, in the call to code we have the
additional summand (x + y) `div` p, so that we are stuck!



4.2 Generalisation 

A strategy that frequently helps when direct folding is not possible is generali- 
sation. It works in two stages. 

— First one introduces additional parameters, which may be completely new 
ones or abstractions of constants in the original specification. These constants 
may even be “invisible” neutral elements which need to be made explicit first. 

— Then one uses the additional degrees of freedom to make the derivation go 
through. 

The original problem is then solved by instantiating the solution for the gener- 
alised problem; this is also known as embedding the original problem into the 
generalised one. This strategy is well-known from inductive proofs: there one 
frequently needs to generalise the induction hypothesis to make the proof go 
through. 

In the case of our adder we introduce a parameter for the extra summand 
that prevented the folding. The generalised specification reads 

    cadd p k (xs ++ [c]) =
      code p (k+1) (deco p k (evns xs) + deco p k (odds xs) + c)

If one wishes to interpret this, then the new parameter c is the carry. But note
that it has been introduced purely formally, “without thinking”, as part of the
generalisation strategy! In fact, this strategy can again be partly automated.
The original problem is retrieved via the embedding

    add p k xs = cadd p k (xs ++ [0])

Now we can replay the derivation for cadd. This results in

    cadd p 0 [c] = [c]
    cadd p (k+1) (xs ++ [x,y,c]) =
      cadd p k (xs ++ [(x+y+c) `div` p]) ++ [(x+y+c) `mod` p]

We need to ensure that the expression (x+y+c) `div` p always yields a proper
digit. The maximal values for x and y are p-1. In this case we have x+y+c = p +
(p+c-2) so that the quotient by p is at least 1. Since 1 is the only nonzero digit
that exists in all number systems, notably the binary system, we have to guarantee
that the quotient does not exceed 1. Hence we need the additional assertion
c `below` 2. Fortunately, this assertion is preserved as an invariant of the
recursion, i.e. if it holds for c it also holds for the new carry (x+y+c) `div` p.
We forego a formal treatment of assertions here and refer to [28] instead.
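The recursive equations for cadd match on the end of the input list, which Haskell patterns cannot express directly. One possible executable rendering, assuming splitAt to separate the last three inputs and a hypothetical helper value that reads a digit list as a number, is sketched below together with a check of the carry invariant.

```haskell
evns, odds :: [a] -> [a]
evns []     = []
evns (x:xs) = x : odds xs
odds []     = []
odds (_:xs) = evns xs

-- Recursive carry adder: 2*k shuffled digits followed by a carry.
cadd :: Int -> Int -> [Int] -> [Int]
cadd p k zs
  | k == 0 && length zs == 1        = zs
  | k > 0 && length zs == 2 * k + 1 =
      cadd p (k - 1) (xs ++ [s `div` p]) ++ [s `mod` p]
  | otherwise                       = error "cadd: expects 2*k digits plus a carry"
  where
    (xs, rest) = splitAt (2 * k - 2) zs   -- rest holds the last x, y and c
    s          = sum rest                 -- x + y + c

-- Hypothetical helper: the number denoted by a digit list in base p.
value :: Int -> [Int] -> Int
value p = foldl (\acc d -> acc * p + d) 0

-- Invariant from the text: a carry below 2 produces a new carry below 2.
carryOk :: Int -> Bool
carryOk p = and [ (x + y + c) `div` p < 2 | x <- ds, y <- ds, c <- [0, 1] ]
  where ds = [0 .. p - 1]
```

As an example, cadd 10 2 [3,4,8,9,0] yields [0,8,7]. More generally, value p applied to the result should equal the sum of the two operands and the carry, which can be tested over all digit lists of a given length.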

4.3 Modularisation 

The resulting expression for the recursive case is very complex. We structure it 
by packing the two expressions for the last digit and the new carry in cadd into 
a function fa defined by 

    fa p [x,y,c] = [(x+y+c) `div` p, (x+y+c) `mod` p]

[Diagram not reproduced: a box fa p with inputs x, y and c and outputs (x+y+c) `div` p and (x+y+c) `mod` p.]

Of course, fa is the full adder function. But note again that this is introduced 
purely formally! 

Now we may use splicing (cf. Section 2.2) to obtain 

cadd p (k+1) = splice (2*k) (cadd p k) (fa p) 
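Read as a program, this equation builds the adder by splicing one full adder onto the adder for k digits. The sketch below makes that explicit under some assumptions: splice is the executable variant using init and last, the base case cadd p 0 is taken to be the identity on singleton lists, and the name rippleAdd is introduced here only for illustration.

```haskell
type Module a = [a] -> [a]   -- assumption: explicit type parameter

splice :: Int -> Module a -> Module a -> Module a
splice m f g inputs =
  case g (drop m xs ++ [c]) of
    (u:us) -> f (take m xs ++ [u]) ++ us
    []     -> error "splice: second module produced no output"
  where
    xs = init inputs
    c  = last inputs

-- Full adder for base-p digits: [carry out, sum digit].
fa :: Int -> Module Int
fa p [x, y, c] = [s `div` p, s `mod` p] where s = x + y + c
fa _ _         = error "fa: expects exactly three inputs"

-- k full adders spliced together; for k = 0 only the carry wire remains.
rippleAdd :: Int -> Int -> Module Int
rippleAdd p k
  | k <= 0    = id
  | otherwise = splice (2 * (k - 1)) (rippleAdd p (k - 1)) (fa p)
```

For instance, rippleAdd 10 2 [3,4,8,9,0] gives [0,8,7] and rippleAdd 2 3 [1,1,1,1,1,1,1] gives [1,1,1,1], i.e. 7 + 7 + 1 = 15.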





For fixed k we may now unwind the recursion to obtain the well-known
regular design of the carry ripple adder:

[Diagram not reproduced: a chain of k full adders fa p, each passing its carry to its left neighbour.]




The associativity of splicing is essential here; it allows this “parenthesis-free” 
graphical layout. Based on the decomposition properties for code and deco we 
can also show a decomposition property for cadd : 

Lemma 4.1. cadd p (k+m) = splice (2*k) (cadd p k) (cadd p m) 




Proof. Consider a list zs ++ zs' ++ [c] with length zs = 2*k and length
zs' = 2*m and set

    xs  = evns zs,   ys  = odds zs,
    xs' = evns zs',  ys' = odds zs'

Then we calculate

    cadd p (k+m) (zs ++ zs' ++ [c])
  = {[ unfold cadd ]}
    code p (k+m+1) (deco p (k+m) (xs ++ xs') +
                    deco p (k+m) (ys ++ ys') + c)
  = {[ by Lemma 3.2 ]}
    code p (k+m+1)
      ((deco p k xs) * p^m + deco p m xs' +
       (deco p k ys) * p^m + deco p m ys' + c)
  = {[ arithmetic ]}
    code p (k+m+1)
      ((deco p k xs + deco p k ys + d) * p^m + r)
      where (d,r) = (z `div` p^m, z `mod` p^m)
            z = deco p m xs' + deco p m ys' + c
  = {[ by Lemma 3.2, property 3 ]}
    code p (k+1) (deco p k xs + deco p k ys + d) ++
    code p m r
      where (d,r) = (z `div` p^m, z `mod` p^m)
            z = deco p m xs' + deco p m ys' + c
  = {[ by Lemma 3.2, property 3 with j = 1 ]}
    code p (k+1) (deco p k xs + deco p k ys + d) ++ us
      where (d:us) = code p (m+1)
                       (deco p m xs' + deco p m ys' + c)
  = {[ fold cadd ]}
    cadd p k (zs ++ [d]) ++ us
      where (d:us) = cadd p m (zs' ++ [c])
  = {[ fold splice ]}
    splice (2*k) (cadd p k) (cadd p m) (zs ++ zs' ++ [c])

□ 

Note that this proof has been performed at the specification level and hence 
holds for all correct implementations of cadd, not just the carry ripple adder! 
This allows modular decomposition of large adders into a (carry ripple) splicing 
of smaller ones, say 4-bit modules, which may even be heterogeneous. Again the 
associativity of splicing is essential here. Here we have a typical combination of 
parameterisation and modularisation. 

It should also be noted that 

fa p [x,y,c] = cadd p 1 [x,y,c] 

so that the carry ripple design can also be seen as the result of an iterated 
application of Lemma 4.1. 
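Since Lemma 4.1 is stated at the specification level, it can be checked against the specification of cadd itself, independently of any particular adder design. The sketch below does this for binary inputs; caddSpec and lemma41 are names chosen here, and the guard-based code and deco as well as the executable splice are assumed variants of the definitions in the text.

```haskell
type Module a = [a] -> [a]   -- assumption: explicit type parameter

splice :: Int -> Module a -> Module a -> Module a
splice m f g inputs =
  case g (drop m xs ++ [c]) of
    (u:us) -> f (take m xs ++ [u]) ++ us
    []     -> error "splice: second module produced no output"
  where
    xs = init inputs
    c  = last inputs

evns, odds :: [a] -> [a]
evns []     = []
evns (x:xs) = x : odds xs
odds []     = []
odds (_:xs) = evns xs

code :: Int -> Int -> Int -> [Int]
code p k n
  | k == 0 && n == 0 = []
  | k > 0 && p > 0   = code p (k - 1) (n `div` p) ++ [n `mod` p]
  | otherwise        = error "code: n is not representable with k digits"

deco :: Int -> Int -> [Int] -> Int
deco p k xs
  | k == 0 && null xs       = 0
  | k > 0 && length xs == k = deco p (k - 1) (init xs) * p + last xs
  | otherwise               = error "deco: list length differs from k"

-- The specification of cadd, with the carry as last input.
caddSpec :: Int -> Int -> Module Int
caddSpec p k inputs =
  code p (k + 1) (deco p k (evns xs) + deco p k (odds xs) + c)
  where
    xs = init inputs
    c  = last inputs

-- Lemma 4.1 for one input list of length 2*(k+m) + 1.
lemma41 :: Int -> Int -> Int -> [Int] -> Bool
lemma41 p k m inputs =
  caddSpec p (k + m) inputs
    == splice (2 * k) (caddSpec p k) (caddSpec p m) inputs
```

Checking lemma41 2 2 1 on all 128 bit lists of length 7 exercises the decomposition of a 3-bit adder into a 2-bit and a 1-bit adder.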

4.4 Abstraction 

We now review the derivation to find the algebraic laws that were used in it. We 
abstract from the particular case of addition and define a general function 

    digrep :: (Int -> Int -> [Int] -> [Int]) ->
              Int -> Int -> [Int] -> [Int]

The idea is that, given a function f :: Int -> Int -> [Int] -> [Int], the
module digrep f p k (zs ++ [c]) takes a list zs, that consists of the shuffled
p-adic representations of two natural numbers m and n, and a “carry” c. From
these inputs it computes a p-adic representation of the value f applied to m, n
and c. Besides its “proper” parameters m, n and c, the function f takes into
account the base p and number k of relevant digits. Assuming that digits p
(2*k + 1) (zs ++ [c]) holds, we specify

    digrep f p k (zs ++ [c]) =
      f p k [deco p k (evns zs), deco p k (odds zs), c]

To retrieve the adder function, we have to set, for m,n `below` p^k,

    f p k [m,n,c] = code p (k+1) (m+n+c)        (*)

For the base case k=0 we calculate

    digrep f p 0 [c]
  = {[ unfold digrep ]}
    f p 0 [deco p 0 [], deco p 0 [], c]
  = {[ unfold deco ]}
    f p 0 [0, 0, c]

For the inductive case we could now also replay the derivation of cadd for digrep. 
However, as the remark at the end of Section 4.3 shows, it is more advanta- 
geous to head for a decomposition property of digrep. By analysing the proof 
of Lemma 4.1 we can find a sufficient condition on f that makes the proof go 
through in general. Following [17] we call f factorizable if

    f p (j+k) [m*p^k+q, n*p^k+r, c] =
      ((f p j) `splice 2` (f p k)) [m,n,q,r,c]

holds for all natural numbers j, k, m, n, p, q, r. Now Lemma 4.1 generalises to

Theorem 4.2 (Factorization Theorem). Let f be factorizable. Then

    digrep f p (k+m) =
      (digrep f p k) `splice (2*k)` (digrep f p m)

Proof.

    digrep f p (k+m) (zs ++ zs' ++ [c])
  = {[ unfold digrep ]}
    f p (k+m) [deco p (k+m) (xs ++ xs'),
               deco p (k+m) (ys ++ ys'), c]
  = {[ unfold deco ]}
    f p (k+m) [(deco p k xs)*p^m + deco p m xs',
               (deco p k ys)*p^m + deco p m ys', c]
  = {[ factorizability ]}
    splice 2 (f p k) (f p m)
      [deco p k xs, deco p k ys,
       deco p m xs', deco p m ys', c]
  = {[ unfold splice ]}
    f p k [deco p k xs, deco p k ys, d] ++ us
      where (d:us) = f p m [deco p m xs', deco p m ys', c]
  = {[ fold digrep twice ]}
    digrep f p k (zs ++ [d]) ++ us
      where (d:us) = digrep f p m (zs' ++ [c])
  = {[ fold splice ]}
    splice (2*k) (digrep f p k) (digrep f p m)
      (zs ++ zs' ++ [c])

□ 

This is in fact Hanna’s Factorization Theorem (see again [17]), which gives 
a general scheme for correct implementations of iterative arithmetic circuits. 
The proof of Lemma 4.1 contains a section that uses Lemma 3.2 to show that 
(*) above defines a factorizable f ; the remainder is isomorphic to the proof of 
Theorem 4.2. 

Using this theorem and the fact that digrep f p 1 = f p 1 we can unwind
digrep f p k into a regular layout:

Corollary 4.3. For k > 0 we have

    digrep f p k = foldr1 (splice 2) (copy k (f p 1))

Here we use the standard Haskell functions foldr1 and copy. The function
foldr1 takes a binary operator and a non-empty list and combines all list
elements by that operator, associating them to the right. A call copy k x produces
a list consisting of k copies of x.
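Corollary 4.3 can be tried out on the adder instance (*). In the sketch below, replicate from the current Haskell Prelude is used in place of copy, and the names addF and unwound as well as the error branches are assumptions made for illustration.

```haskell
type Module a = [a] -> [a]   -- assumption: explicit type parameter

splice :: Int -> Module a -> Module a -> Module a
splice m f g inputs =
  case g (drop m xs ++ [c]) of
    (u:us) -> f (take m xs ++ [u]) ++ us
    []     -> error "splice: second module produced no output"
  where
    xs = init inputs
    c  = last inputs

evns, odds :: [a] -> [a]
evns []     = []
evns (x:xs) = x : odds xs
odds []     = []
odds (_:xs) = evns xs

code :: Int -> Int -> Int -> [Int]
code p k n
  | k == 0 && n == 0 = []
  | k > 0 && p > 0   = code p (k - 1) (n `div` p) ++ [n `mod` p]
  | otherwise        = error "code: n is not representable with k digits"

deco :: Int -> Int -> [Int] -> Int
deco p k xs
  | k == 0 && null xs       = 0
  | k > 0 && length xs == k = deco p (k - 1) (init xs) * p + last xs
  | otherwise               = error "deco: list length differs from k"

-- Specification of digrep: decode both operands, then apply f.
digrep :: (Int -> Int -> [Int] -> [Int]) -> Int -> Int -> Module Int
digrep f p k inputs = f p k [deco p k (evns zs), deco p k (odds zs), c]
  where
    zs = init inputs
    c  = last inputs

-- The adder instance (*) of f.
addF :: Int -> Int -> [Int] -> [Int]
addF p k [m, n, c] = code p (k + 1) (m + n + c)
addF _ _ _         = error "addF: expects two numbers and a carry"

-- Right-hand side of Corollary 4.3, with replicate standing in for copy.
unwound :: (Int -> Int -> [Int] -> [Int]) -> Int -> Int -> Module Int
unwound f p k = foldr1 (splice 2) (replicate k (f p 1))
```

For k = 3 and base 2 the two sides can be compared on all 128 inputs of length 7; unwound fails for k = 0 because foldr1 requires a non-empty list, which matches the side condition k > 0.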

Another instance of digrep is a comparator circuit, described by

    digrep f p k
      where f p k [m,n,c] = [eq m n /\ c]        (**)

Here,

    eq m n = if m == n then 1 else 0
    b /\ c = b*c

so that we have numerical representations of the usual Boolean operations. It is
straightforward to show that the function f in (**) is indeed factorizable. To
obtain a comparator circuit, we have to instantiate c appropriately, viz. by the
neutral element 1 of /\, and to unwind the specification using the Factorization
Theorem. This results in a chain of k single-digit cells f p 1 connected by
splice 2, as in Corollary 4.3.
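A hedged sketch of the unwound comparator follows. It assumes the executable splice used for the adder, takes replicate in place of copy, and introduces the names cmpCell and comparator for illustration; the cell ignores the base and the digit count because equality of digits does not depend on them.

```haskell
type Module a = [a] -> [a]   -- assumption: explicit type parameter

splice :: Int -> Module a -> Module a -> Module a
splice m f g inputs =
  case g (drop m xs ++ [c]) of
    (u:us) -> f (take m xs ++ [u]) ++ us
    []     -> error "splice: second module produced no output"
  where
    xs = init inputs
    c  = last inputs

-- Numerical representations of equality and conjunction, as in the text.
eq :: Int -> Int -> Int
eq m n = if m == n then 1 else 0

(/\) :: Int -> Int -> Int
b /\ c = b * c

-- The function f of (**); base and digit count are not needed.
cmpCell :: Int -> Int -> [Int] -> [Int]
cmpCell _ _ [m, n, c] = [eq m n /\ c]
cmpCell _ _ _         = error "cmpCell: expects two digits and a flag"

-- k cells in a row, fed with the neutral element 1 at the right end.
comparator :: Int -> Int -> [Int] -> [Int]
comparator p k zs = foldr1 (splice 2) (replicate k (cmpCell p 1)) (zs ++ [1])
```

With the shuffled operands 123 and 123 the call comparator 10 3 [1,1,2,2,3,3] returns [1], while comparator 10 3 [1,1,2,5,3,3] returns [0] because the middle digits differ.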
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4.5 Re-Use of a Design: Successor (Counting) 

Next we want to derive a counter circuit, ie. an implementation of the successor 
function on digit representations. The specification reads 

succ :: Int -> Int -> [Int] -> [Int]
succ p k xs = code p (k+1) (deco p k xs + 1)

This is quite similar to the adder specification. We therefore try to re-use the 
adder design. Formally we need to reduce succ to add; this is done by making 
the hidden neutral element 0 of addition visible so that we have a second operand 
for addition. We calculate: 

succ p k xs

= {[ unfold succ ]}

code p (k+1) (deco p k xs + 1)

= {[ neutrality of 0 ]}

code p (k+1) (deco p k xs + 0 + 1)

= {[ fold deco ]}

code p (k+1) (deco p k xs + deco p k (copy k 0) + 1)

= {[ fold cadd ]}

cadd p k (shuf k (xs ++ copy k 0) ++ [1])

Although this is a first correct implementation, it is too inefficient. The fact that 
in the unwound version we have calls of the form fa [x,0,c] may be used to 
simplify the design. Define an auxiliary function 

ha [x,c] = fa [x,0,c] = [(x+c) `div` p, (x+c) `mod` p]

Of course, ha is the half adder function. But again it has been introduced purely 
formally. 
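
The effect of this simplification can be checked by comparing the specification of succ with a chain of half adders. The sketch below rests on several assumptions: deco and code are given plausible definitions (most significant digit first), splice is reconstructed from the "unfold splice" step in Section 4.4, and the chain of half adders with carry-in 1 is one reading of the simplified design. The names succSpec and succImpl are illustrative.

```haskell
-- Digit list (most significant first) to number; k is kept only for symmetry.
deco :: Int -> Int -> [Int] -> Int
deco p _ = foldl (\acc d -> acc * p + d) 0

-- Number to exactly k digits in base p.
code :: Int -> Int -> Int -> [Int]
code p k n = reverse (take k (map (`mod` p) (iterate (`div` p) n)))

-- Specification of the successor function.
succSpec :: Int -> Int -> [Int] -> [Int]
succSpec p k xs = code p (k + 1) (deco p k xs + 1)

-- Half adder in base p: inputs [x, carryIn], outputs [carryOut, sumDigit].
ha :: Int -> [Int] -> [Int]
ha p [x, c] = [(x + c) `div` p, (x + c) `mod` p]
ha _ _ = error "ha: expects exactly two inputs"

-- Reconstructed splice: g's leading output feeds f as its last input.
splice :: Int -> ([a] -> [a]) -> ([a] -> [a]) -> [a] -> [a]
splice n f g xs = f (take n xs ++ [d]) ++ us
  where (d : us) = g (drop n xs)

-- Counter as a ripple of k half adders (k > 0) with carry-in 1.
succImpl :: Int -> Int -> [Int] -> [Int]
succImpl p k xs = foldr1 (splice 1) (replicate k (ha p)) (xs ++ [1])
```

On the three-digit decimal input [1,9,9] both versions return [0,2,0,0].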

The simplified design looks as follows: 




4.6 Specialisation: Base 2 

For p=2 we obtain the usual representations

ha 2 [x,c] = [x /\ c, x >< c]

fa 2 [x,y,c] = [d \/ e, z]
  where [d,u] = ha 2 [x,y]
        [e,z] = ha 2 [u,c]







Here, /\, \/ and >< are again the arithmetic representations of the Boolean 
operations on base 2 digits with >< denoting exclusive or. 

4.7 The Carry Lookahead Adder and Hybrid Adders 

It is well known that the carry ripple adder is time-inefficient, since the length 
of the longest path through the design (along which the carries ripple) is propor- 
tional to the number of digits processed. So there have been various proposals to 
speed up the carry computation. One idea is to compute the carries in parallel 
with the sums; this leads to the carry lookahead adder which we want to derive 
formally now. 

Let the modules in the carry ripple adder be numbered from the right starting
with 0 and let x i, y i and c i be the i-th input digits and carries (where c 0
is some given value).

From the carry ripple design we read off the recurrence equation

c (i+1) = (p i /\ c i) \/ g i where
(g i, p i) = (x i /\ y i, x i >< y i)

By usual techniques for solving recurrences we obtain a closed form for the 
carries: 

c (i+1) =
  foldr1 (\/) [ (foldr1 (/\) [ p k | k <- [j+1..i] ]) /\ g j |
                j <- [-1..i] ]
  where g (-1) = c 0

For reasons of space we draw the picture of the carry lookahead computation
only for 3 digits:

Using this form of carry computation results in a circuit in which the path
length is independent of the number of digits processed. This gain is bought at 
the expense of a maximal fan-in that is proportional to the number of digits. So 
for electrical reasons this design is meaningful only for small numbers of digits, 
say 4 or 8. But from our above decomposition property we know that we may 
connect several carry lookahead adders in a carry ripple fashion to obtain a 
correct adder which will then be faster by a factor 4 or 8 than the original pure 
carry ripple adder. 
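
The closed form for the carries can be compared with the ripple recurrence on small inputs. The following sketch makes two assumptions that are not in the text: the range of j is taken to be [-1..i] (suggested by g (-1) = c 0), and the inner conjunction uses foldr with neutral element 1 so that an empty product is well defined. Digits are supplied as index functions x and y, with index 0 the rightmost digit; the function names are illustrative.

```haskell
-- Arithmetic representations of the Boolean operations on base 2 digits.
(/\), (\/), (><) :: Int -> Int -> Int
b /\ c = b * c
b \/ c = b + c - b * c
b >< c = (b + c) `mod` 2

-- Carry into position n, following the recurrence
-- c (i+1) = (p i /\ c i) \/ g i.
rippleCarry :: (Int -> Int) -> (Int -> Int) -> Int -> Int -> Int
rippleCarry x y c0 = go
  where
    go 0 = c0
    go n = (p (n - 1) /\ go (n - 1)) \/ g (n - 1)
    g i = x i /\ y i
    p i = x i >< y i

-- Carry into position n from the closed form (n = i+1 in the text).
lookaheadCarry :: (Int -> Int) -> (Int -> Int) -> Int -> Int -> Int
lookaheadCarry x y c0 n =
  foldr1 (\/)
    [ foldr (/\) 1 [ p k | k <- [j + 1 .. n - 1] ] /\ g j
    | j <- [-1 .. n - 1] ]
  where
    g j | j < 0     = c0          -- g (-1) = c 0
        | otherwise = x j /\ y j
    p k = x k >< y k
```

Every term of the outer disjunction depends only on the inputs, which is what makes the path length independent of the number of digits, at the price of the growing fan-in noted above.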

5 More About Wiring 

So far we have mostly described connections using the rubber view of wires 
(“logical connection”). We now sketch how to step from the logical connection 
to a topology with rigid wires, crossings and fan-out. 

Note, however, that many approaches start at this level and have to carry the 
complications of wiring all through the derivation. This is tedious and obscures 
the essential steps. 

5.1 Basic Wiring Elements 

The basic wiring elements are a straight wire, modeled by the identity function, 
the fan-out of degree 2 (fork), the crossing (swap) and the sink: 

id [x] = [x] 

fork [x] = [x,x] 

swap [x,y] = [y,x] 

sink [x] = []

These operations can be extended to wire bundles in a straightforward way:

bfork m n xs
  | length xs == n = foldr (++) [] (copy m xs)
  -- undefined otherwise

bswap m n xs
  | length xs == n = drop m xs ++ take m xs
  -- undefined otherwise




The identity id is predefined polymorphically by id y = y and hence does 
not need to be extended to wire bundles. The sink can be handled by setting 
generally sink xs = [] . We will discuss other versions later. 

Finally, we have the invisible module ide with 0 inputs and 0 outputs: 

ide [] = [] 



5.2 Sequential and Parallel Composition 

Sequential composition simply is reverse function composition. We are a bit 
sloppy here about the arities of the functions and simply define 

(f |> g) xs = g (f xs) 




For parallel composition we need to supply the respective operator with the
number k of inputs to be routed to the left module; the remaining ones are
routed to the right module:

par k f g xs = f (take k xs) ++ g (drop k xs)



By the definition of take and drop this works even if k > length xs: in that 
case f gets the full list xs whereas g only gets the empty list [] . Here is the 
diagram for par k f g: 



Conventional polymorphism is too weak to model parallel composition more 
adequately. To avoid the necessity of uniformly typing all list elements one would 
need an extension to “tuples as first-class citizens” with concatenation of tuple 
types and also of tuples as primitive operations. However, this might lead to 
problems with automatic type checking. Therefore we have chosen the simple 
approach above. 

We abbreviate par 1 by the infix operator |||.
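
The wiring elements and the two composition operators can be run as they stand. The sketch below follows the definitions in the text; the explicit error clauses, the fixity declarations and the type signatures are additions made here for illustration.

```haskell
infixl 1 |>
infixr 2 |||

-- Basic wiring elements on lists of wires.
fork, swap, sink, ide :: [a] -> [a]
fork [x] = [x, x]
fork _   = error "fork: expects one wire"
swap [x, y] = [y, x]
swap _      = error "swap: expects two wires"
sink _ = []                      -- the general version: sink xs = []
ide [] = []
ide _  = error "ide: expects no wires"

-- Sequential composition is reverse function composition.
(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

-- Parallel composition: the first k wires go to f, the rest to g.
par :: Int -> ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
par k f g xs = f (take k xs) ++ g (drop k xs)

(|||) :: ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
(|||) = par 1
```

For example, par 2 swap fork [1,2,3] crosses the first two wires and duplicates the third, giving [2,1,3,3], and (fork |> (id ||| sink)) [7] gives [7].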



5.3 Basic Laws (Network Algebra I) 

All semantic models for graph-like networks should enjoy a number of natural 
properties which reflect the abstraction that lies in the graph view. A systematic 
account of these properties has been given in [38] . 



Associativity:

f |> (g |> h) = (f |> g) |> h
(f `par m` g) `par (m+k)` h = f `par m` (g `par k` h)

Abiding Law I:

((f |> g) `par m` (h |> k)) xs =
((f `par m` h) |> (g `par n` k)) xs
<== n = length (f (take m xs))

Neutrality:

id |> f = f = f |> id
(f `par m` ide) xs = f xs <== length xs = m
ide `par 0` f = f

Involution:

(swap |> swap) xs = xs <== length xs = 2

Whereas associativity and abiding just allow “parenthesis-free layouts” , use of 
neutrality or involution means simplification/complexification of abstract lay- 
outs. 
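
Some of these laws can be spot-checked on concrete modules. The sketch below is only a test harness, not a proof: the sample modules dup, inc, dbl and rev are hypothetical, and each property is evaluated for given parameters and inputs.

```haskell
infixl 1 |>

(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

par :: Int -> ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
par k f g xs = f (take k xs) ++ g (drop k xs)

swap :: [a] -> [a]
swap [x, y] = [y, x]
swap _      = error "swap: expects two wires"

-- Hypothetical sample modules; dup doubles the number of wires.
dup, inc, dbl, rev :: [Int] -> [Int]
dup = concatMap (\x -> [x, x])
inc = map (+ 1)
dbl = map (* 2)
rev = reverse

-- Associativity of parallel composition.
parAssoc :: Int -> Int -> [Int] -> Bool
parAssoc m k xs =
  par (m + k) (par m inc dbl) rev xs == par m inc (par k dbl rev) xs

-- Abiding Law I, with n computed as in its side condition.
abiding :: Int -> [Int] -> Bool
abiding m xs =
  par m (dup |> inc) (dbl |> rev) xs == (par m dup dbl |> par n inc rev) xs
  where n = length (dup (take m xs))

-- Involution of swap on two wires.
involution :: [Int] -> Bool
involution xs = (swap |> swap) xs == xs
```

The side condition of the abiding law matters here: since dup changes the number of wires, the split point n of the second stage differs from m.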

5.4 Selection 

Using parallel composition we can now give alternative definitions for block 
identity and sink: 

bid n = foldr (|||) ide (copy n id)

bsink n = foldr (|||) ide (copy n sink)




Based on this we define selection nets: 
sel n i j =
(bsink i) `par i` ((bid j) `par j` (bsink (n-j-i)))
for 0 <= i <= n and 0 <= j <= (n-i).
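
A selection net simply keeps j wires starting at position i. The following sketch implements bid, bsink and sel as defined here and also evaluates one instance of the fusion rule that the text states next; replicate stands in for copy, and the error clauses are additions for illustration.

```haskell
infixl 1 |>
infixr 2 |||

(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

par :: Int -> ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
par k f g xs = f (take k xs) ++ g (drop k xs)

(|||) :: ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
(|||) = par 1

sink, ide :: [a] -> [a]
sink _ = []
ide [] = []
ide _  = error "ide: expects no wires"

-- Fan-out of a bundle of width n into m copies.
bfork :: Int -> Int -> [a] -> [a]
bfork m n xs
  | length xs == n = concat (replicate m xs)
  | otherwise      = error "bfork: wrong bundle width"

-- Block identity and block sink built from the single-wire versions.
bid, bsink :: Int -> [a] -> [a]
bid n   = foldr (|||) ide (replicate n id)
bsink n = foldr (|||) ide (replicate n sink)

-- Selection net: drop i wires, keep j wires, drop the remaining n-j-i.
sel :: Int -> Int -> Int -> [a] -> [a]
sel n i j = par i (bsink i) (par j (bid j) (bsink (n - j - i)))
```

On the bundle "abcde", sel 5 1 2 yields "bc", and forking the bundle into sel 5 1 2 and sel 5 3 1 yields "bcd", the same as sel 5 1 3.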




We have the following fusion rule: 

(bfork 2 n) |> ((sel n i j) `par n` (sel n (i+j) k)) =
sel n i (j+k)

5.5 Recursions for the Bundle Operations

Using sequential and parallel composition we can reduce the bundle operations 
to the primitives. 

Example 5.1. The bundle operation bswap m n swaps its first m inputs with 
the following n ones. It is defined by the equations 

bswap m 0 = ide
bswap 0 n = id
bswap 1 1 = swap
bswap k (k+m+n) = ((bswap k (k+m)) `par (k+m)` (bid n)) |>
                  ((bid m) `par m` (bswap k (k+n)))

□ 




5.6 Combinator Abstraction 

We have already discussed the need to pass from rubber wiring to rigid wiring. 
This is achieved by eliminating all formal parameters from functional expressions 
in favour of parallel and sequential composition and the basic wiring elements. 
This is analogous to the process of λ-abstraction in combinatory logic (see e.g.
[1]). Therefore we term this operation combinator abstraction.

We use the following simplified syntax for Haskell expressions: 

expr ::= parid | (opid expr ... expr) | (expr ++ ... ++ expr) 

Here we suppose that parid produces the admissible identifiers for formal pa- 
rameters, whereas opid produces the admissible operators such as div or (+). 

For each expression E we now want to construct a composed module CA E (CA 
stands for combinator abstraction) that computes the corresponding function on 
a list of inputs to produce a list of outputs. 

For its definition, we need the list ID E of the formal parameters occurring 
in expression E. This list is organised in textual order of appearance of the 
parameters and kept repetition free. It is inductively defined as follows: 

ID x = [x]
ID (op e1 ... en) = remdups ([op] ++ ID e1 ++ ... ++ ID en)
ID (e1 ++ ... ++ en) = remdups (ID e1 ++ ... ++ ID en)

Here remdups removes duplicates, preserving the leftmost occurrence of each
item in a list: 

remdups [] = [] 

remdups (x : xs) = x : remdups [y | y <- xs, y /= x] 

Suppose now an expression E all formal parameters of which occur in the
repetition free list [x0, ..., xn-1]. Then the combinator abstraction CA E of
E will take a list of n inputs to produce a corresponding list of outputs. It is
defined by induction over the structure of E. To ease the definition, we introduce
an auxiliary definition of CA op where op is an opid.

CA xi = sel n i 1

CA op = \xs -> [ op (xs!!0) ... (xs!!(k-1)) ]
        if op :: t0 -> ... -> tk-1 -> t

CA (op E1 ... Em) =
(CA (E1 ++ ... ++ Em)) |> CA op

CA (E1 ++ ... ++ Em) =
bfork m n |> (CA E1 `par n` ( ... `par n` CA Em))

Example 5.2. Suppose E = ( [x /\ y] ++ [y >< x]). Then 

CA E =
bfork 2 2 |>
(bfork 2 2 |> ((sel 0 1 `par 2` sel 1 1) |> CA /\) `par 2`
 bfork 2 2 |> ((sel 1 1 `par 2` sel 0 1) |> CA ><))







This can, of course, be simplified to 

bfork 2 2 |> ((bid 2 |> CA /\) `par 2` (swap |> CA ><))







□

The basic rules above lead to circuits involving very high fan-outs. More
refined rules avoid this, e.g.

CA (E ++ F) = CA E `par k` CA F

if ID (E ++ F) = ID E ++ ID F, i.e. if the sublists of formal parameters are
disjoint and in order, and k = length (ID E).

5.7 A Further Example: Shuffling 

Recall the specification of the shuffle operation from Section 4.2: 

(shuf k xs) !! (2*i) = xs !! i
(shuf k xs) !! (2*i+1) = xs !! (k+i)

for length xs == 2*k and i <- [0..k-1].

Some calculation yields the following inductive version:

shuf 0 = id
shuf 1 = id
shuf (k+1) =
(id `par 1` ((cshifti k) `par k` id)) |>
(id `par 2` (shuf k))

cshifti k = foldr1 (splice 1) (copy k swap)




For further details on wiring we refer to [19,31]. 
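
The inductive version of shuf can be compared with its specification. The sketch below is written under stated assumptions: splice is the reconstruction used for the adder (the leading output of the right module feeds the left module as its last input), so that cshifti k acts on k+1 wires and moves the last wire to the front. For that reason the sketch routes k+1 wires to cshifti k; this is a choice made here to make the example run, not a claim about the intended reading of the formula above.

```haskell
infixl 1 |>

(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

par :: Int -> ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
par k f g xs = f (take k xs) ++ g (drop k xs)

swap :: [a] -> [a]
swap [x, y] = [y, x]
swap _      = error "swap: expects two wires"

-- Reconstructed splice (an assumption, see the lead-in).
splice :: Int -> ([a] -> [a]) -> ([a] -> [a]) -> [a] -> [a]
splice n f g xs = f (take n xs ++ [d]) ++ us
  where (d : us) = g (drop n xs)

-- Specification: outputs 2*i and 2*i+1 are inputs i and k+i.
shufSpec :: Int -> [a] -> [a]
shufSpec k xs = concat [ [xs !! i, xs !! (k + i)] | i <- [0 .. k - 1] ]

-- Chain of k swaps: moves the last of k+1 wires to the front.
cshifti :: Int -> [a] -> [a]
cshifti k = foldr1 (splice 1) (replicate k swap)

-- Inductive shuffle on 2*n wires.
shuf :: Int -> [a] -> [a]
shuf 0 = id
shuf 1 = id
shuf n = par 1 id (par (k + 1) (cshifti k) id) |> par 2 id (shuf k)
  where k = n - 1
```

For example, shuf 3 "abcxyz" interleaves the two halves into "axbycz".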



Part II: Sequential Hardware 

6 A Model of Streams 

A frequently used model of sequential hardware is that of stream transformers. 
Streams are used to model the temporal succession of values on the connection 
wires, whereas the modules are functions from (bundles of) input streams to 
(bundles of ) output streams. 

In this paper we deal with discrete time only. Even this leaves several options
how to represent streams. One possibility would be to define

type Stream a = [a]

Since Haskell employs a lazy semantics, this allows finite as well as infinite 
streams. Time remains implicit, but can be introduced using the list indexing 
operation. 

We use a version which explicitly refers to time: 

type Time = Int 

type Stream a = Time -> a 

This will carry over easily to real time. On the other hand, this does not directly
support finite streams. They would have to be modeled by functions that become
eventually constant, preferably yielding only the pseudo-value undefined after
the “proper” finite part.



7 Networks 



Again we model bundles of inputs and outputs by lists, this time of streams. 
By polymorphism we can re-use all our connection primitives, such as |>, par, 
fork, swap and splice and their laws for stream transformers. 

Our diagrams will now be drawn sideways: 



[Figure: sideways diagrams of f |> g, par k f g, fork and swap]

The input/output streams are numbered from bottom to top in the respective
lists.

7.1 Lifting and Constant 

To establish the connection with combinational circuits we need to iterate their 
behaviour in time. To this end we introduce liftings of operations on data to 
streams. A “unary” operation takes a singleton list of input data and produces 
a singleton list of output data. This is lifted to a function from a singleton list 
of input streams to a singleton list of output streams. It is the analogue of the 
apply-to-all operation map on lists. We define

lift1 :: (a -> b) -> [Stream a] -> [Stream b]
lift1 f [d] = [\t -> f (d t)]

Alternatively, since streams are functions themselves, the lifting may also be
expressed using function composition:

lift1 f [d] = [f . d]

Similarly, we have for binary operations

lift2 :: (a -> a -> b) -> [Stream a] -> [Stream b]
lift2 g [d,e] = [\t -> g (d t) (e t)]




The inscriptions of the boxes follow notationally the view of infinite streams 
of functions used in [29] . 

Another useful building block is a module that emits a constant output 
stream. For convenience we endow it with a (useless) input stream. So this 
module actually is a combination of a sink and a source. We define 

cnst :: a -> [Stream b] -> [Stream a]
cnst x = lift1 (const x)




Here const is a predefined Haskell function that produces a constant unary 
function from a value. 
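
The time-indexed stream model and the liftings can be exercised on small examples. In the sketch below the definitions follow the text; the helper observe, which samples the first n values of each stream in a bundle, and the error clauses are additions for illustration.

```haskell
type Time = Int
type Stream a = Time -> a

-- Lifting of a unary operation to singleton bundles of streams.
lift1 :: (a -> b) -> [Stream a] -> [Stream b]
lift1 f [d] = [f . d]
lift1 _ _   = error "lift1: expects one stream"

-- Lifting of a binary operation: two input streams, one output stream.
lift2 :: (a -> a -> b) -> [Stream a] -> [Stream b]
lift2 g [d, e] = [\t -> g (d t) (e t)]
lift2 _ _      = error "lift2: expects two streams"

-- Constant stream; the input stream is ignored.
cnst :: a -> [Stream b] -> [Stream a]
cnst x = lift1 (const x)

-- Hypothetical helper: the first n values of every stream in a bundle.
observe :: Int -> [Stream a] -> [[a]]
observe n = map (\s -> map s [0 .. n - 1])
```

With the identity function as the input stream 0, 1, 2, ..., observe 4 (lift1 (* 2) [id]) gives [[0,2,4,6]].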



7.2 Initialised Unit Delay 

To model memory of the simplest kind we use a unit delay module. Other delays 
such as inertial delay or transport delay can be modeled similarly. For a value x 
the stream transformer (x >-) shifts its input stream by one time unit; at time 
0 it emits x as the initial value: 

(>-) :: a -> [Stream a] -> [Stream a]
(x >- [d]) = [e]
  where e t | t == 0 = x
            | t > 0 = d (t-1)



We now state laws for pushing delays through larger networks. They allow
for each circuit constructor to shift delay elements from the input side to the 
output side or vice versa under suitable change of the initialisation value. These 
laws are used centrally in our treatment of systolic circuits. 

Lemma 7.1 (Delay Propagation Rules). If f is strict, i.e. undefined whenever
its argument is, then

(x>-) |> lift1 f = lift1 f |> ((f x) >-)

If g is doubly strict, i.e. is undefined whenever both its arguments are, then

((x>-) ||| (y>-)) |> lift2 g = lift2 g |> ((g x y)>-)

Moreover,

(x>-) |> cnst y = cnst y |> (y>-)
((x>-) ||| (y>-)) |> swap = swap |> ((y>-) ||| (x>-))
(x>-) |> fork = fork |> ((x>-) ||| (x>-))

These rules can be given in pictorial form as

For propagation through |> and par we may use associativity of |> and the
abiding law.

These simple laws are quite effective as will be seen in later examples. In our 
derivation of systolic circuits we actually use them in the direction from right to 
left to shift delays from outputs to inputs. 
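
Two of the propagation rules can be observed on a concrete stream. The sketch below evaluates both sides of the rule for lift1 and of the rule for fork on the input stream 1, 2, 3, ...; the helper observe, the fixity declarations and the four test values are assumptions made for illustration, and the delay treats every t other than 0 as "later", which only matters for negative times.

```haskell
infixl 1 |>
infixr 2 |||

type Time = Int
type Stream a = Time -> a

(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

(|||) :: ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
(f ||| g) xs = f (take 1 xs) ++ g (drop 1 xs)

fork :: [a] -> [a]
fork [x] = [x, x]
fork _   = error "fork: expects one wire"

lift1 :: (a -> b) -> [Stream a] -> [Stream b]
lift1 f [d] = [f . d]
lift1 _ _   = error "lift1: expects one stream"

-- Initialised unit delay: x at time 0, then the input shifted by one tick.
(>-) :: a -> [Stream a] -> [Stream a]
x >- [d] = [\t -> if t == 0 then x else d (t - 1)]
_ >- _   = error ">-: expects one stream"

-- Hypothetical helper: the first n values of every stream in a bundle.
observe :: Int -> [Stream a] -> [[a]]
observe n = map (\s -> map s [0 .. n - 1])

-- Both sides of two propagation rules on the stream 1, 2, 3, ...
liftLhs, liftRhs, forkLhs, forkRhs :: [[Int]]
liftLhs = observe 5 (((3 >-) |> lift1 (* 2)) [(+ 1)])
liftRhs = observe 5 ((lift1 (* 2) |> ((3 * 2) >-)) [(+ 1)])
forkLhs = observe 4 (((3 >-) |> fork) [(+ 1)])
forkRhs = observe 4 ((fork |> ((3 >-) ||| (3 >-))) [(+ 1)])
```

Moving the delay from the input side to the output side changes the initial value from 3 to f 3 = 6, and both sides of the first rule produce 6, 2, 4, 6, 8.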

8 Example: The Single Pulser 

To show our algebra at work we will treat a single pulser. This is another of 
the IFIP verification benchmarks [20]. The informal specification requires the 
module to emit a unit pulse whenever a pulse starts in its input stream.

8.1 Formal Specification

We model this by a transformer of streams of Booleans. A pulse is a maximal time 
interval on which a stream is constantly True. First we formally characterise 
those time points at which a pulse starts by 

startPulse :: Stream Bool -> Time -> Bool
startPulse d t = d t && ( t==0 || not (d (t-1)) )

Note that by Time -> Bool = Stream Bool we may view startPulse also as 
a stream transformer. 

Now we can give the formal specification of the pulser: 
pulser [d] = [ \t -> startPulse d t ] 

Equivalently, by extensional equality, 
pulser [d] = [ startPulse d ] 

8.2 Derivation of a Pulser Circuit

For t = 0 we calculate 
startPulse d 0

= {[ unfold startPulse ]}

d 0 && ( 0==0 || not (d (0-1)) )

= {[ Boolean algebra ]}

d 0

For t > 0 we have

startPulse d t

= {[ unfold startPulse ]}

d t && ( t==0 || not (d (t-1)) )

= {[ t > 0 and Boolean algebra ]}

d t && not (d (t-1))

= {[ fold >- ]}

d t && not (e t) where [e] = x >- [d]

for arbitrary x. Now we try to choose the initialisation value x such that

startPulse d t = d t && not (e t)

holds also for t=0, i.e.

d 0 = d 0 && not x

This is satisfied for all values d 0 iff x = False.

Now combinator abstraction and simplification yields

pulser = fork |> ( ((False >-) |> lift1 not) ||| id )
         |> lift2 (&&)
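
The derived circuit can be run against the specification. The following sketch assembles the pulser from the building blocks of Sections 5 and 7; the helper fromList, which turns a finite list into a stream that is False afterwards, the fixity declarations and the error clauses are assumptions introduced for the example.

```haskell
infixl 1 |>
infixr 2 |||

type Time = Int
type Stream a = Time -> a

(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

(|||) :: ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
(f ||| g) xs = f (take 1 xs) ++ g (drop 1 xs)

fork :: [a] -> [a]
fork [x] = [x, x]
fork _   = error "fork: expects one wire"

lift1 :: (a -> b) -> [Stream a] -> [Stream b]
lift1 f [d] = [f . d]
lift1 _ _   = error "lift1: expects one stream"

lift2 :: (a -> a -> b) -> [Stream a] -> [Stream b]
lift2 g [d, e] = [\t -> g (d t) (e t)]
lift2 _ _      = error "lift2: expects two streams"

(>-) :: a -> [Stream a] -> [Stream a]
x >- [d] = [\t -> if t == 0 then x else d (t - 1)]
_ >- _   = error ">-: expects one stream"

-- Specification: a pulse starts at t if d is True now and was not before.
startPulse :: Stream Bool -> Time -> Bool
startPulse d t = d t && (t == 0 || not (d (t - 1)))

-- Derived circuit: delay initialised with False, negation, conjunction.
pulser :: [Stream Bool] -> [Stream Bool]
pulser = fork |> (((False >-) |> lift1 not) ||| id) |> lift2 (&&)

-- Hypothetical helper: a finite list as a stream, padded with a default.
fromList :: a -> [a] -> Stream a
fromList def xs t = if t < length xs then xs !! t else def
```

For an input that is True at times 1, 2 and 4, the circuit emits True exactly at times 1 and 4.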






9 Feedback 

For describing systems with memory we need another essential ingredient. It is 
the very general concept of feeding back some outputs of a module as inputs 
again. This allows, in particular, the preservation of a value for an arbitrary 
period, ie. storing of values. 



9.1 The Feedback Operation 

Given a module f :: Module the module feed k f results from f by feeding
back the last k outputs to the last k inputs:

feed :: Int -> Module -> Module
feed k f xs = codrop k ys
  where ys = f (xs ++ cotake k ys)

cotake n xs = drop (length xs - n) xs
codrop n xs = take (length xs - n) xs



Note the recursive definition of ys that reflects the flowing back of informa- 
tion. This recursion is well-defined by the lazy semantics of Haskell. 
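
A small accumulator shows the feedback operation at work. The sketch below is an illustration under assumptions: Module is specialised to bundles of integer streams, and the cell accCell is a hypothetical example. Note the lazy pattern ~[d, s]: in this sketch the module must deliver the shape of its output list before inspecting its input list, otherwise computing cotake k ys (which needs length ys) would loop.

```haskell
type Time = Int
type Stream a = Time -> a
type Module = [Stream Int] -> [Stream Int]

-- Initialised unit delay.
(>-) :: a -> [Stream a] -> [Stream a]
x >- [d] = [\t -> if t == 0 then x else d (t - 1)]
_ >- _   = error ">-: expects one stream"

-- Last n elements, and everything except the last n elements.
cotake, codrop :: Int -> [a] -> [a]
cotake n xs = drop (length xs - n) xs
codrop n xs = take (length xs - n) xs

-- Feed the last k outputs of f back to its last k inputs.
feed :: Int -> Module -> Module
feed k f xs = codrop k ys
  where ys = f (xs ++ cotake k ys)

-- Hypothetical cell: inputs [d, s], outputs [o, o], where o adds the
-- current input to the previous value of s (initially 0).
accCell :: Module
accCell ~[d, s] = [o, o]
  where
    [e] = 0 >- [s]
    o t = d t + e t

-- Closing the loop stores the running sum of the input stream.
runningSum :: Module
runningSum = feed 1 accCell
```

Applied to the stream 1, 2, 3, ..., the single output stream of runningSum starts 1, 3, 6, 10, 15.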

9.2 Properties of Feedback (Network Algebra II) 

The feedback operation enjoys a number of algebraic laws which show that it 
models the rubber wire abstraction correctly. For a systematic exposition see 
again [38].

Stretching wires:

Assume that f is defined only for input lists of length m and that g always
produces output lists of length n+k. Then we have

f |> feed k g |> h = feed k ((f `par m` id) |> g |>
                             (h `par n` id))

Abiding law II:

f `par n` feed k g = feed k (f `par n` g)

Shifting a module:

Assume that f is defined only for input lists of length m+k and always produces
output lists of length n+k and that g is defined only for input lists of length k
and always produces output lists of length k. Then we have

feed k (f |> (id `par n` g)) = feed k ((id `par m` g) |> f)

9.3 Interconnection (Mutual Feedback)

In more complex designs it may be convenient to picture a module f with inputs 
and outputs distributed to both sides: 



f 



We want to compose two such functions to model interconnection of the 
respective modules. To this end we introduce 

connect :: Int -> Int -> Int -> Module -> Module -> Module

The three Int-parameters in connect k m n f g are used similarly as for splic- 
ing: they indicate that k inputs are supposed to come from the left neighbour of 
f, that m wires lead from f to g, and that n outputs go to the right neighbour 
of g. 




We define therefore 

connect k m n f g xs = take n zs ++ drop m ys 

where ys = f (take k xs ++ drop n zs) 

zs = g (take m ys ++ drop k xs) 

This involves a mutually recursive definition of ys and zs which again is well- 
defined by the lazy Haskell semantics. 
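
The mutual recursion can be watched on a small instance. The sketch below uses plain integer wires rather than streams, which the polymorphic definitions permit; it instantiates connect 1 1 0 with a fork-shaped left module, the special form treated later in Lemma 9.2, and compares the result with the corresponding feedback-free composition. The operator definition for =| as connect 1 1 0 and the sample modules g0 and h0 are assumptions made for the example.

```haskell
infixl 1 |>
infixr 2 |||

(|>) :: ([a] -> [b]) -> ([b] -> [c]) -> [a] -> [c]
(f |> g) xs = g (f xs)

(|||) :: ([a] -> [b]) -> ([a] -> [b]) -> [a] -> [b]
(f ||| g) xs = f (take 1 xs) ++ g (drop 1 xs)

fork :: [a] -> [a]
fork [x] = [x, x]
fork _   = error "fork: expects one wire"

-- Interconnection: k inputs from the left, m wires from f to g,
-- n outputs to the right; ys and zs are mutually recursive.
connect :: Int -> Int -> Int -> ([a] -> [a]) -> ([a] -> [a]) -> [a] -> [a]
connect k m n f g xs = take n zs ++ drop m ys
  where
    ys = f (take k xs ++ drop n zs)
    zs = g (take m ys ++ drop k xs)

-- Assumed reading of the operator with external wires on the left only.
(=|) :: ([a] -> [a]) -> ([a] -> [a]) -> [a] -> [a]
f =| g = connect 1 1 0 f g

-- Hypothetical sample modules.
g0, h0, f0 :: [Int] -> [Int]
g0 = map (* 2)
h0 [a, b] = [a + b]
h0 _      = error "h0: expects two wires"
f0 = (fork ||| id) |> (id ||| h0)

viaConnect, viaSequence :: [Int]
viaConnect  = (f0 =| g0) [5]
viaSequence = (fork |> (id ||| g0) |> h0) [5]
```

Both values evaluate to [15]: the input 5 is passed to g0, and h0 adds it to the value 10 coming back from g0. Lazy evaluation lets take 1 ys be produced before h0 demands the value fed back from g0.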

Lemma 9.1. Interconnection is associative in the following sense: 

(f `connect k m n` g) `connect k n p` h =
f `connect k m n` (g `connect k m p` h)




Moreover, connect has the identity id as its neutral element. 
Two interesting special cases are 



f =||= g = connect 1 1 1 f g

and

f =| g = connect 1 1 0 f g




The symbols have been chosen such that they indicate the places of the
external wires of the resulting circuits: whereas f =||= g has external wires on
the left and on the right, f =| g has them only on the left. The operator =||= is
also known as mutual feedback ⊗ (see e.g. [4]). The corresponding network can
be depicted as




Using a suitable torsion of the network we can relate interconnection to 
feedback: 

f =||= g =
feed 1 ( (id ||| swap) |>
         (f `par 2` id) |> (id ||| swap) |>
         (g `par 2` id) |> (id ||| swap) )




With the help of this connection, the proof of associativity of connect can 
be given using purely the laws of network algebra. Hence the lemma is valid for 
all models of network algebra, not just our particular one.

In special cases the interconnection reduces to simple sequential composition,
thus eliminating the internal feedback loop. As an example we state

Lemma 9.2. Suppose that f has the special form

f = (fork ||| id) |> (id ||| h)

Then

f =| g = fork |> (id ||| g) |> h
Pictorially, 




10 A Convolver 

We want to tackle a somewhat more involved example now. In particular, we 
want to prepare the way to systolic circuits. We treat a convolver, another of 
the IFIP verification benchmarks [20] . 

A non-programmable convolver of degree n uses n fixed weights to compute 
at each time point t >= n the convolution of its previous n inputs by these 
weights. Mathematically the convolution is defined as 

\sum_{i=1}^{n} w_{n-i} * d(t-i),

where d is the input stream and the w_j are the weights. Convolution is used e.g.
in digital filters.

10.1 Specification 

Using list comprehension, the above mathematical definition can be directly 
transcribed into a Haskell specification. For convenience we collect the weights 
into another stream w. Then the convolver is specified by 

conv :: Stream Int -> Int -> [Stream Int] -> [Stream Int]
conv w n [d] =
  [ \t -> if t < n
            then undefined
            else sum [ w (n-i) * d (t-i) | i <- [1..n] ] ]

It should be clear that the problem generalises to arbitrary compositions of fold 
and apply-to-all operations. Since we have taken such an abstraction step already 
in Section 4.4, we do not want to repeat this here.

10.2 About Error Handling

We have modeled non-initialisation by the pseudo-value undefined. However,
it turns out that the only essential assumption about undefined that goes into
the derivation is the strictness property

x + undefined = undefined

This could also be achieved by introducing an additional error element using 
Haskell’s facilities for defining variant record types and adapting addition ac- 
cordingly: 

data Error a = Proper a | Err
instance Num a => Num (Error a) where
  Proper x + Proper y = Proper (x+y)
  _ + _ = Err

Similar definitions would be given for the other arithmetic operations. Since
this is somewhat cumbersome, though, we have chosen the above method. 



10.3 Derivation of a Convolver Circuit 

We now want to derive from the formal specification a regular layout described, 
as in the case of the adder, by a recursion. The obvious parameter to drive the 
recursion is the number n of terms in the sum, since the summation function 
is defined recursively itself and we can carry over its recursion structure to the 
convolver circuit. 

The base case is n = 0. For t >= 0 and [e] = conv w 0 d we calculate

e t

= {[ specification of e ]}

sum [ w (0-i) * d (t-i) | i <- [1..0] ]

= {[ definition of intervals ]}

sum [ w (0-i) * d (t-i) | i <- [] ]

= {[ definition of list comprehension ]}

sum []

= {[ definition of sum ]}

0

Hence conv w 0 = cnst 0 with cnst defined as in Section 7.1.

We now perform the induction step. For t >= n+1 and [e] = conv w (n+1)
d we obtain

e t

= {[ specification of e ]}

sum [ w (n+1-i) * d (t-i) | i <- [1..n+1] ]

= {[ splitting the interval, definition of sum ]}

w n * d (t-1) + sum [ w (n+1-i) * d (t-i) | i <- [2..n+1] ]

= {[ index transformation ]}

w n * d (t-1) + sum [ w (n+1-(j+1)) * d (t-(j+1)) | j <- [1..n] ]

= {[ arithmetic ]}

w n * d (t-1) + sum [ w (n-j) * d (t-1-j) | j <- [1..n] ]

= {[ fold conv ]}

w n * d (t-1) + c (t-1)
where [c] = conv w n d
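
The result of this calculation can be checked numerically before any circuit is built. The sketch below sets the specification next to a function convRec that follows the derived recurrence directly, with the base case taken from conv w 0 = cnst 0; convRec and its error clause are illustrative, and the comparison is meaningful only for t >= n, where the specification is defined.

```haskell
type Time = Int
type Stream a = Time -> a

-- Specification of the convolver of degree n with weight stream w.
conv :: Stream Int -> Int -> [Stream Int] -> [Stream Int]
conv w n [d] =
  [ \t -> if t < n
            then undefined
            else sum [ w (n - i) * d (t - i) | i <- [1 .. n] ] ]
conv _ _ _ = error "conv: expects one input stream"

-- Recurrence from the derivation:
-- e t = w n * d (t-1) + c (t-1), where c is the convolver of degree n.
convRec :: Stream Int -> Int -> [Stream Int] -> [Stream Int]
convRec _ 0 _   = [const 0]
convRec w n [d] = [ \t -> w (n - 1) * d (t - 1) + c (t - 1) ]
  where [c] = convRec w (n - 1) [d]
convRec _ _ _   = error "convRec: expects one input stream"
```

With weights w k = k+1 and input d t = t*t, both versions of degree 3 give 14 at time 3.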

Now combinator abstraction, together with Lemma 9.2, yields 

conv w (n+1) = (cell w n) =| (conv w n)
cell w k = (fork ||| id) |> (id ||| h)
h w k = (lift1 ((w k) *) ||| id) |>
        (lift2 (+)) |> (undefined >-)

This recursive formation law for the basic convolver can be depicted as follows,
where ⊥ stands for undefined:




10.4 Unwinding the recursion

For fixed n > 0 we obtain again a regular design:

conv w n =
(foldr1 (=||=) [ cell w k | k <- [1..n] ]) =| cnst 0

After simplification of the rightmost cell this yields the design








Deductive Hardware Design: A Functional Approach 







10.5 Towards a Systolic Version 

A circuit is combinational if it uses only lifted operations and sequential or 
parallel composition. In a clocked circuit, the clock period is determined by the 
stabilisation time of the circuit which depends on its longest combinational path. 

In systolic circuits one tries to minimise the clock period by making the 
combinational modules involved quite small. Then the clock period can be kept 
relatively short, namely it can be taken as the maximum of the stabilisation 
times of the combinational submodules involved. Since there is, however, no 
general rule for calling a combinational module “small” the precise definition of 
systolism avoids such a notion. Rather, a circuit is called systolic (cf. [24, 25, 14]) 
if there is at least one delay element along every connection wire between any of 
its combinational modules. A related but somewhat different notion of systolism 
is used in the field of massively parallel systems; however, there no explicit delay 
elements are employed. 

We want to obtain a systolic version of our convolver. Hence we have to 
introduce additional delay elements. 

11 Speedup by Slowdown 

The technique to introduce delays formally is slowdown (see e.g. [24,25,21]). 
The k-fold slowed down version of a circuit works on k interleaved streams. So
each of these is processed at a rate k times slower than in the original circuit.

11.1 Interleaved Streams 

To talk about the component streams of such a “multistream” we introduce 
split k j d t = d (k*t + j) 

So split k j d is the j-th of the k component streams where numbering starts 
with 0 again. Eg. split 2 0 d and split 2 1 d consist of the values in d at 
even and odd time points, respectively. Then d can be considered as an alter- 
nating interleaving of these. The interleaving of k=4 streams may be depicted as 
follows:

The following properties of split are useful for proving the slowdown
propagation rules below:

Lemma 11.1. We have 

(x>-) |> split k 0 = (split k (k-1)) |> (x>-)
(x>-) |> split k j = split k (j-1)    (0 < j < k)

To interleave k streams from a list ss we use

ileave k ss t = (ss !! (t `mod` k)) (t `div` k)

Provided that length ss >= k, we have

split k j (ileave k ss) = ss !! j
A special case is the interleaving of k copies of the same stream: 
rep k d = ileave k (copy k d) 

The above property yields 

split k j (rep k d) = d 
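
These properties are easy to observe on concrete streams. The sketch below uses the definitions of split, ileave and rep from the text on single streams; replicate stands in for copy, and the three sample streams in samples are hypothetical.

```haskell
type Time = Int
type Stream a = Time -> a

-- The j-th of the k component streams of d (numbering starts with 0).
split :: Int -> Int -> Stream a -> Stream a
split k j d t = d (k * t + j)

-- Round-robin interleaving of the first k streams in ss.
ileave :: Int -> [Stream a] -> Stream a
ileave k ss t = (ss !! (t `mod` k)) (t `div` k)

-- Interleaving of k copies of the same stream.
rep :: Int -> Stream a -> Stream a
rep k d = ileave k (replicate k d)

-- Hypothetical sample streams: 0,1,2,...  100,101,...  200,201,...
samples :: [Stream Int]
samples = [id, (+ 100), (+ 200)]
```

Interleaving the three samples gives 0, 100, 200, 1, 101, 201, ..., and split 3 1 recovers the second stream 100, 101, 102, ... from it.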

11.2 The Slowdown Function 

Now the slowdown function is specified implicitly by 

(slow k f) |> split k j = (split k j) |> f

Here f is an arbitrary function on streams, not just a lifted unary operation. 
In particular, f may look at all the history of a stream. By this definition, 
slow k f s may be considered as splitting s into k substreams, processing these 
individually with f and interleaving the result streams back into one stream. 
From the specification the following proof principle is evident: 

Lemma 11.2. If for a function h and all j in [1..k] we have
h |> split k j = (split k j) |> f 
then h = slow k f . 

For easier manipulation we want to obtain an explicit version of slow. Since 
by definition of split 

split k j (slow k f s) t' = slow k f s (k*t' + j)    (*)

we have conversely

slow k f s t

= {[ definition of `div` and `mod` ]}

slow k f s (k * (t `div` k) + t `mod` k)

= {[ by (*) ]}

split k (t `mod` k) (slow k f s) (t `div` k)

= {[ specification of slow ]}

f (split k (t `mod` k) s) (t `div` k)

In sum,

slow k f s t = f (split k (t `mod` k) s) (t `div` k)
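
The explicit form can be executed directly. The sketch below works on single streams rather than bundles, which is a simplification made here; delay is the unit delay written as a function on single streams. It illustrates on one example that slowing a delay down by a factor of 2 behaves like two delays in sequence.

```haskell
type Time = Int
type Stream a = Time -> a

split :: Int -> Int -> Stream a -> Stream a
split k j d t = d (k * t + j)

-- Explicit slowdown: process each of the k component streams with f
-- and read the results back in interleaved order.
slow :: Int -> (Stream a -> Stream b) -> Stream a -> Stream b
slow k f s t = f (split k (t `mod` k) s) (t `div` k)

-- Unit delay on a single stream, initialised with x.
delay :: a -> Stream a -> Stream a
delay x d t = if t == 0 then x else d (t - 1)

-- The 2-slow delay and two delays in sequence, on the stream 1, 2, 3, ...
slowedDelay, doubleDelay :: [Int]
slowedDelay = map (slow 2 (delay 0) (+ 1)) [0 .. 5]
doubleDelay = map ((delay 0 . delay 0) (+ 1)) [0 .. 5]
```

Both lists start 0, 0, 1, 2, 3, 4: each of the two interleaved component streams is delayed by one of its own ticks, which amounts to two ticks of the combined stream.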



11.3 Slowdown Algebra 

The function slow distributes nicely through our circuit building operators: 



Lemma 11.3. We have 

slow k (x >-)     = foldr (|>) id (copy k (x >-))
slow k (cnst x)   = cnst x
slow k (f |> g)   = slow k f |> slow k g
slow k (f ||| g)  = slow k f ||| slow k g
slow k (f =||= g) = slow k f =||= slow k g
slow k (f =| g)   = slow k f =| slow k g
slow k (feed m f) = feed m (slow k f)



This means that the k-fold slowed down version of a circuit results by replac- 
ing each delay element by k identical delay elements. A further useful propagation 
law for slow is given by 

Lemma 11.4. Suppose that (x>-) |> f = f |> (y>-). Then also 
(x>-) |> slow k f = (slow k f) |> (y>-) 



12 A Systolic Convolver: The 2-Slow Convolver 

Using k-fold slowdown we can interleave k streams or pad one stream with 
dummy elements by merging it with constant streams of dummies. The latter 
approach is usually taken in verification approaches to the systolic convolver: 
one uses slowdown by 2 and is interested only in the stream values at odd time 
points; at even time points eg. the value 0 is used. 

We want to derive a systolic convolver by deductive design. We leave the de- 
cision whether to use proper interleaving or padding open; both can be achieved 
by suitable embeddings of the original conv function into the slowed down one 
defined by

sconv w n = (foldr1 (=||=) [scell w k | k <- [1..n]]) =| cnst 0
scell w k = (undefined >-) |> (fork ||| id) |> (id ||| h)
h w k     = (lift1 ((w k) *) ||| id) |>
            (lift2 (+)) |> (undefined >-)



This simplifies into 




Of course, the techniques we have developed do not only apply to the con- 
volver, but are of general interest for the derivation of systolic implementations 
of circuits. As a further case study, a systolic recogniser for regular expressions 
is developed in [30] . 

The derivations have been fairly short; the underlying technique is applica- 
ble quite generally. Moreover, the semantical basis is simple. So our approach 
compares favourably with verification approaches in this area (see eg. [27,39]). 

13 Pipelining 

As a final example we want to leave the level of circuits and step up to questions 
about microprocessor architectures. To exemplify our approach in that area, we 
give a brief account of the essence of pipelining. 

We use a set a of instruction addresses, a set i of instructions and a set s of 
machine states. Assume, moreover, a function 

fetch :: a -> s -> i

that obtains the instruction stored under an address in the current state and a 
function 

exe :: i -> s -> s

for executing an instruction in a state to yield a new state. The fetch/execute- 
cycle of a machine can then be defined by the function 

run : : [a] -> s -> s 

run [] q = q 

run (x : xs) q = run xs (exe (fetch x q) q) 
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We now want to uncouple the fetch and execute phases so that they can be done 
in parallel. This is done by a suitable embedding into a function that has as 
parameters an instruction to be performed currently and a list of addresses of 
further instructions: 

pipe : : [a] -> i -> s -> s 

pipe xs j q = run xs (exe j q) 

The original function run is reduced to pipe by the equations 

run [] q = q                                -- pipeline exhausted

run (x : xs) q = pipe xs (fetch x q) q      -- put first instruction into pipeline and run that

The goal is now a version of pipe that is independent of run. 

As the termination case we obtain 

pipe [] j q = exe j q

To derive the recursive case we need the central assumption for the correctness of the version of pipelining we treat here. We stipulate that execution of an instruction does not change the contents of the program memory. This means that the program has to be kept in a part of the memory that is administered in a read-only fashion. This assumption can be expressed formally as

fetch a (exe j q) = fetch a q (*) 

for all a, j, q. With this assumption we calculate

  pipe (x : xs) j q

= {[ unfold pipe ]}

  run (x : xs) (exe j q)

= {[ unfold run ]}

  run xs (exe (fetch x q') q')
  where q' = exe j q

= {[ by (*) ]}

  run xs (exe (fetch x q) q')
  where q' = exe j q

= {[ fold pipe ]}

  pipe xs (fetch x q) (exe j q)

This means that fetching the next instruction can be done in parallel with executing the current one.
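
The outcome of the derivation can be checked on a small instance. The sketch below uses a hypothetical toy machine (Addr, Instr, State and demo are illustrative assumptions, not part of the text) whose exe never touches the program memory, so that assumption (*) holds. It places the specification of pipe next to the derived version that no longer refers to run.

```haskell
-- Hypothetical toy machine (illustration only).
type Addr = Int

data Instr = Add Int | Mul Int
  deriving (Eq, Show)

data State = State { program :: [Instr], acc :: Int }
  deriving (Eq, Show)

fetch :: Addr -> State -> Instr
fetch x q = program q !! x

-- exe changes only the accumulator, hence
-- fetch a (exe j q) = fetch a q holds, i.e. assumption (*).
exe :: Instr -> State -> State
exe (Add n) q = q { acc = acc q + n }
exe (Mul n) q = q { acc = acc q * n }

-- Sequential fetch/execute cycle.
run :: [Addr] -> State -> State
run []       q = q
run (x : xs) q = run xs (exe (fetch x q) q)

-- Specification of pipe by embedding into run.
pipeSpec :: [Addr] -> Instr -> State -> State
pipeSpec xs j q = run xs (exe j q)

-- Derived version, independent of run.
pipe :: [Addr] -> Instr -> State -> State
pipe []       j q = exe j q
pipe (x : xs) j q = pipe xs (fetch x q) (exe j q)

-- run reduced to pipe.
runViaPipe :: [Addr] -> State -> State
runViaPipe []       q = q
runViaPipe (x : xs) q = pipe xs (fetch x q) q

demo :: State
demo = State { program = [Add 2, Mul 3, Add 1], acc = 0 }
```

In the recursive equation of pipe, both fetch x q and exe j q refer to the old state q and do not depend on each other; this is what allows the two phases to proceed in parallel. On this toy machine, runViaPipe xs demo and run xs demo agree for every list xs of valid addresses.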

Note that the derivation is completely polymorphic; no assumptions are made about the types a, s and i. The only assumption is the validity of equation (*). In particular, the transformation can be iterated to obtain pipelines with several stages if exe can be decomposed into further subfunctions.

Further investigations here will concern transformational derivations of realistic pipeline processors such as those treated formally in [8].

14 Conclusion 

We have seen a number of essential ingredients of deductive hardware design: 

— algebraic reasoning, 

— parameterisation, 

— modularisation, 

— re-use of designs and derivations, 

— precise determination of initialisation values. 

Special emphasis was laid on parameterisation and re-usability aspects. 

The major novel ingredients and achievements in this work are the following: 

— specification at the level of predicate logic, not necessarily algorithmic yet, 

— a clearer disentangling of the abstract idea of an algorithm from the concrete 
layout that realizes it, 

— in particular, the introduction of wiring operators at a late stage of the derivations, which avoids a lot of burden and clutter,

— a simpler approach to retiming that avoids the concept of anti-delays. 

The case studies include several of the IFIP WG10.5 Benchmark Verification Problems. Besides basic combinational and sequential circuits, a very simple treatment of systolic circuits has been achieved. Finally, concerning higher-level hardware concepts, an easy formal account of pipelining became possible.

The switch to formalising the specifications and derivations using the functional programming language Haskell brought many advantages. The polymorphism of this language allows the use of Ştefănescu's network algebra and other algebraic laws uniformly at the levels of both combinational and sequential circuits. Also, fixpoint induction and related proof principles can be applied directly. Moreover, many derivations can be performed in a polymorphic way, abstracting from concrete applications and hence achieving much better re-usability. Last but not least, the derivations could be formally checked using the ULTRA transformation system.

Further elaboration of this approach will mainly concern deductive design in 
the large, asynchronous systems and other notions of time. 

Appendix: Essential Constructs of Haskell 

Basic Types and Functions 

For those not familiar with Haskell, we briefly review its essential elements. Basic types are Int for the integers and Bool for the Booleans with elements True and False. Exponentiation is written in the form x^n. The operations of conjunction and disjunction are denoted by && and ||, respectively. These are the semi-strict versions that evaluate their arguments from left to right, i.e. they satisfy

x && y = if x then y else False

x || y = if x then True else y
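
The effect of semi-strictness can be observed directly. The following small sketch shows that the right argument is not evaluated when the left one already determines the result; the names andShortCut, orShortCut, divides and kilo are chosen here for illustration only.

```haskell
-- The second argument is evaluated only if the first one does not
-- already determine the result, so both expressions are defined.
andShortCut :: Bool
andShortCut = False && undefined    -- yields False

orShortCut :: Bool
orShortCut = True || undefined      -- yields True

-- Typical use: a test on the left protects a partial operation on the
-- right (divides is a hypothetical helper, not taken from the text).
divides :: Int -> Int -> Bool
divides d n = d /= 0 && n `mod` d == 0

-- Exponentiation with a non-negative integer exponent.
kilo :: Int
kilo = 2 ^ 10                       -- yields 1024
```

Swapping the arguments, as in undefined && False, would leave the result undefined, because evaluation proceeds from left to right.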

The type of functions taking elements of type a as arguments and producing elements of type b as results is a -> b. The fact that a function f has this type is expressed as f :: a -> b.

Function application is denoted by juxtaposing function and argument, separated by at least one blank, in the form f x. Functions of several arguments are mostly used in curried form f x1 x2 ... xn. In this case f has the higher-order type f :: t1 -> (t2 -> ... (tn -> t) ...) or, abbreviated, f :: t1 -> t2 -> ... -> tn -> t (the arrow -> associates to the right, whereas function application associates to the left).

Functions are defined by equations of the form f x = E or as (anonymous) lambda abstractions. Instead of λx.E one uses the notation \x -> E.

A two-place function f :: a -> b -> c may also be used as an infix operator in the form x `f` y; this is equivalent to the usual application f x y. E.g. instead of div x y one may also write x `div` y. To formulate a number of expressions and properties in a more readable way we use a small notational extension of this: we also use larger expressions (that do not contain the backquote) between backquotes as infix operators. E.g. for

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]

we may then write xs `zipWith (+)` ys for the componentwise addition of lists xs and ys.

For a binary operator ?, by supplying only one of its arguments one obtains a residual function or operator section of the form

(x ?) = \y -> x ? y or (? y) = \x -> x ? y
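
The notations introduced so far can be tried out in a few lines. The following sketch is for illustration only, and all names in it are hypothetical. Note that standard Haskell accepts only a single function name between backquotes, so the extended backquote notation used in this paper has to be written in prefix form there.

```haskell
-- A curried two-place function and an equivalent lambda abstraction.
add :: Int -> Int -> Int
add x y = x + y

add' :: Int -> Int -> Int
add' = \x -> \y -> x + y

-- Supplying only the first argument yields a one-place function.
increment :: Int -> Int
increment = add 1

-- A named function used as an infix operator.
quotient :: Int
quotient = 17 `div` 5              -- same as div 17 5

-- Operator sections fix one argument of a binary operator.
halve :: Int -> Int
halve = (`div` 2)                  -- \x -> x `div` 2

subtractFrom10 :: Int -> Int
subtractFrom10 = (10 -)            -- \y -> 10 - y

-- Componentwise addition of two lists, in standard prefix form.
sums :: [Int]
sums = zipWith (+) [1, 2, 3] [10, 20, 30]
```

Here quotient is 3, halve 9 is 4, subtractFrom10 3 is 7 and sums is [11,22,33].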

Case Distinction and Assertions 

Haskell offers several possibilities for doing case distinctions. One is the usual if 
then else construct. To avoid cascades of ifs, a function may also be defined 
in a style similar to the one used in mathematics. The notation is 

f x
  | C1 = E1
  ...
  | Cn = En

The result is the value of the first expression Ei for which the corresponding Ci 
evaluates to True. If there is none, the result is undefined. 

We shall also use this to make functions intentionally partial in order to 
enforce assertions about their parameters (see [28]). 

To avoid partiality one can use the predefined constant otherwise = True 
and add a final clause 

  | otherwise = En+1

Yet another way of case distinction is provided by defining a function through 
argument patterns. Several equations indicate what a function does on inputs 
that have certain shapes. The equations are tried in textual order; if no pattern 
matches the current argument, the function is again undefined at that point. 

Example 14.1. By the equations 

f 0 = 5 
f 1 = 7 

the function f :: Int -> Int is defined only for argument values 0 and 1. □
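
The three styles of case distinction can be placed side by side. In the following sketch, sign and predecessor are hypothetical examples that do not occur elsewhere in the text; f is the function of Example 14.1.

```haskell
-- Guards are tried in textual order; the first one that evaluates to
-- True selects the result. The final otherwise clause avoids partiality.
sign :: Int -> Int
sign x
  | x > 0     = 1
  | x < 0     = -1
  | otherwise = 0

-- Intentionally partial: the guard serves as an assertion that the
-- argument is positive; for other arguments the result is undefined.
predecessor :: Int -> Int
predecessor n
  | n > 0 = n - 1

-- Definition by argument patterns; f is undefined outside 0 and 1.
f :: Int -> Int
f 0 = 5
f 1 = 7
```

Evaluating predecessor 0 or f 2 raises a run-time error, which is how an undefined result shows up in an actual implementation.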



Lists 

The type of lists of elements of type a is denoted by [a]. The list consisting of elements x1, ..., xn is written as [x1, ..., xn]; in particular, [] is the empty list. Concatenation is denoted by ++. Prefixing an element to a list is denoted by the colon operator:

x:xs = [x] ++ xs

The function length returns the length of a list. The ith element of list xs is selected by the expression xs !! i (where numbering starts with 0).

A list may be split into two parts using the functions 

take, drop :: Int -> [a] -> [a]

For non-negative integer k the list take k xs consists of the first k elements 
of xs if k <= length xs and of all of xs if k > length xs. For negative k the 
expression take k xs is undefined. The list drop k xs results by removing take 
k xs from the front of xs. Hence one always has 

take k xs ++ drop k xs = xs 
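
The splitting law can be tested on sample data. The sketch below is an illustration with hypothetical names (splitAtK, splitLaw, lawHolds). It only uses non-negative k, since the Prelude of current Haskell implementations returns the empty list for take k xs with negative k rather than leaving the expression undefined.

```haskell
-- Split a list into its first k elements and the remainder.
splitAtK :: Int -> [a] -> ([a], [a])
splitAtK k xs = (take k xs, drop k xs)

-- The law take k xs ++ drop k xs = xs for a given k and list.
splitLaw :: Int -> [Int] -> Bool
splitLaw k xs = take k xs ++ drop k xs == xs

-- Check the law for all k from 0 to 7 on a list of length 5;
-- this covers both k <= length xs and k > length xs.
lawHolds :: Bool
lawHolds = and [splitLaw k [1 .. 5] | k <- [0 .. 7]]
```

For example, splitAtK 2 "hello" yields ("he","llo"), and splitAtK 7 [1,2,3] yields ([1,2,3],[]).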

A very useful specification feature is list comprehension in the form

[ f x | x <- L, p x ]

where L is a list expression, f some function on the list elements and p a Boolean function. The symbol <- may be viewed as a leftward arrow and pronounced as “drawn from” or as a form of element sign. In this latter view, the expression is the list analogue of the usual set comprehension {f x | x ∈ S, p x}. The meaning of the list comprehension expression [ f x | x <- L, p x ] is again a list, constructed as follows:

— The elements of list L are scanned in left-to-right order. 

— On each such element x the test p is performed. 

— If p x = True, f x is put into the result list. 

— Otherwise, x is ignored. 

The list [m, m+1, ..., n] of integers may be denoted by the shorthand [m..n]. The right bound n may be omitted; then the expression [m..] denotes the infinite list [m, m+1, ...].
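
As a small illustration (with hypothetical names), the following sketch shows a list comprehension, an equivalent formulation using the standard functions map and filter, and a comprehension over an infinite range.

```haskell
-- Squares of the even numbers between 1 and 10:
-- scan [1..10] from left to right, test with even, collect x * x.
evenSquares :: [Int]
evenSquares = [x * x | x <- [1 .. 10], even x]

-- The same list expressed with map and filter.
evenSquares' :: [Int]
evenSquares' = map (\x -> x * x) (filter even [1 .. 10])

-- Omitting the right bound gives an infinite list; thanks to lazy
-- evaluation only the part demanded by take is ever computed.
firstOdds :: [Int]
firstOdds = take 5 [x | x <- [1 ..], odd x]
```

Both evenSquares and evenSquares' denote [4,16,36,64,100], and firstOdds denotes [1,3,5,7,9].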

A useful operation on non-empty lists is the folding of their elements using a binary operator f:

foldr1 f [x1,...,xn] = f x1 (f x2 ... (f xn-1 xn) ...)

E.g. foldr1 (+) s computes the sum of all elements of s. The function foldr1 itself has the type (a -> a -> a) -> [a] -> a.

A variant of foldr1 that also copes with empty lists is foldr; it uses an additional argument e that specifies the value for empty lists. The defining equations read

foldr f e [] = e

foldr f e (x:xs) = f x (foldr f e xs)

Based on foldr one can define a universal quantifier over lists. For a predicate p :: a -> Bool one has

all p xs = foldr (&&) True [p x | x <- xs ] 

So all p xs yields True iff p x yields True for all x in xs. 
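
The folding operations can be illustrated as follows. This is a minimal sketch with hypothetical names; the quantifier is called allP here only to avoid a clash with the predefined function all.

```haskell
-- foldr1 nests to the right: 1 + (2 + (3 + 4)).
total :: Int
total = foldr1 (+) [1, 2, 3, 4]

-- With a non-associative operator the nesting becomes visible:
-- 10 - (3 - 2).
nested :: Int
nested = foldr1 (-) [10, 3, 2]

-- foldr also copes with the empty list and then yields e.
emptySum :: Int
emptySum = foldr (+) 0 []

-- Universal quantification over lists, defined as in the text.
allP :: (a -> Bool) -> [a] -> Bool
allP p xs = foldr (&&) True [p x | x <- xs]
```

Here total is 10, nested is 9 and emptySum is 0; allP even [2,4,6] yields True, allP even [2,3] yields False, and allP p [] yields True for every p.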




